GitHub Version Control for Zeppelin Notebooks

To configure version control for notebooks using GitHub, perform the following tasks:

  1. Configure Version Control Settings
  2. Generate a GitHub Token in the GitHub Profile
  3. Configure a GitHub Token
  4. Link Notebooks to GitHub

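After configuring the GitHub repository, you can perform the following tasks to manage the notebook versions:

  • Pushing Commits to GitHub
  • Restoring a Commit from GitHub
  • Creating a Pull Request from Notebooks
  • Resolving Conflicts While Using GitHub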

Configuring Version Control Settings

You must have Account Admin privileges to perform this task.

Prerequisites

For AWS, if you want to use a Self-managed repository with a Bastion node, you must first set up the Bastion node by performing the following steps:

  1. Creating a VPC with Private and Public Subnets
  2. Configuring the Route Tables
  3. Creating a Security Group in the VPC
  4. Configuring a Unique SSH Key for Your Account
  5. Configuring a Custom SSH Key for the Bastion Node
  6. Bringing Up the Bastion Host

Perform the following steps to configure the version control settings:

  1. Navigate to Home >> Control Panel >> Account Settings.
  2. On the Account Settings page, scroll down to the Version Control Settings section.
  3. From the Version Control Provider drop-down list, select GitHub.
  4. From the Repository Hosting Type drop-down list, select Service-managed or Self-managed.
  5. For Service-managed, the API Endpoint is auto-populated. For Self-managed, enter the valid URL of your self-managed GitHub account. To connect through a Bastion node, select Use Bastion Node and enter the IP Address, User, and Port.
  6. Click Save.

The following figure shows a sample Version Control Settings section.

../../../../../_images/vcs-github1.png

The following figure shows a sample Version Control Settings section with the Self-managed and Bastion node options.

../../../../../_images/github-self-managed1.png

Generating a GitHub Token in the GitHub Profile

As a prerequisite, you must get a GitHub token. Perform the following steps to get the GitHub token:

  1. Create a GitHub token by following the GitHub Documentation.
  2. Copy the generated GitHub token to configure it in the Qubole account.
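
Before you paste the token into Qubole, it can help to confirm that the token is valid. The following is a minimal sketch, not part of Qubole, that builds a request to the GitHub REST API /user endpoint. It assumes github.com (for a self-managed host, the API base URL differs), and the helper names build_token_check_request and parse_scopes are hypothetical.

```python
import urllib.request

# Assumption: the token belongs to github.com, not a self-managed host.
GITHUB_API = "https://api.github.com"


def build_token_check_request(token, api_base=GITHUB_API):
    """Build (but do not send) a request asking GitHub who owns the token."""
    return urllib.request.Request(
        f"{api_base.rstrip('/')}/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def parse_scopes(header_value):
    """Split an X-OAuth-Scopes response header into a list of scope names."""
    if not header_value:
        return []
    return [scope.strip() for scope in header_value.split(",") if scope.strip()]
```

To run the check, pass the request to urllib.request.urlopen. A 200 response means that the token is accepted. For classic personal access tokens, the X-OAuth-Scopes response header lists the granted scopes, which parse_scopes can split. Fine-grained tokens may not return this header.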

Configuring a GitHub Token

You can configure a GitHub token for notebooks at the per-user level from either the My Accounts page or the Notebooks UI.

  • Configuring the GitHub token for notebooks for your account:

    1. Navigate to Control Panel >> My Accounts.
    2. For your account, under the GitHub Token column, click Configure.
    3. Add the generated GitHub token and click Save.

    The GitHub token is configured at the per-user level.

  • Configuring the GitHub token from notebooks:

    1. Navigate to Notebooks and click a notebook.
    2. Click the Manage notebook versions icon at the top right of the notebook. The Versions panel expands as shown in the following figure.

    ../../../../../_images/ConfigGitHubinNote.png
    3. Click Configure now.

    4. In the dialog box, add the generated GitHub token and click Save.

      The GitHub token is now configured for your account.

Linking Notebooks to GitHub

After configuring the GitHub token, you can link the GitHub repository from notebooks.

  1. Obtain the GitHub repository URL.

    1. Navigate to the GitHub profile and click Repositories.

    2. From the list of repositories, click the repository that you want to link.

    3. Copy the URL that is displayed within that repository.

      Alternatively, you can navigate to the GitHub profile and copy the URL from the browser’s address bar.

      Note

      If you want to use the HTTPS *.git link as the GitHub repository URL, click Clone or Download. A drop-down text box is displayed. Copy the HTTPS URL, or click Use HTTPS (if it is displayed) and then copy the HTTPS URL.

  2. Click the Manage notebook versions icon at the top right of the notebook. The Versions panel expands as shown in the following figure.

    ../../../../../_images/LinkGitHubVersion1.png
  3. Click the Link Now option.

  4. In the Link Notebook to GitHub dialog box, perform the following actions:

    1. Add the GitHub repository URL in the Repository Web URL text field. Ensure that the GitHub profile token has read permissions for the repository to check out a commit and write permissions for the repository to push a commit.

    2. Select a branch from the Branch drop-down list.

    3. Enter the object path of the file in the Object Path text field.

      A sample is as shown in the following figure.

      ../../../../../_images/LinkNotetoGitHub.png
    4. Click Save.
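
The repository URL can be either the web URL or the HTTPS *.git URL, and the token needs read access to check out and write access to push. The following sketch shows one way to check both outside Qubole. It assumes the URL has the form https://host/owner/repo, and it interprets the permissions object that the GitHub REST API returns from GET /repos/{owner}/{repo} for an authenticated request. The function names are hypothetical.

```python
from urllib.parse import urlparse


def split_repo_url(url):
    """Return (host, owner, repo) from a GitHub web URL or HTTPS *.git URL."""
    parsed = urlparse(url.strip())
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.scheme != "https" or not parsed.netloc or len(parts) < 2:
        raise ValueError(f"not an HTTPS repository URL: {url!r}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return parsed.netloc, owner, repo


def can_checkout_and_push(permissions):
    """Map the API's `permissions` object to (can_checkout, can_push)."""
    return bool(permissions.get("pull")), bool(permissions.get("push"))
```

For example, split_repo_url accepts both https://github.com/example-org/notebooks and https://github.com/example-org/notebooks.git and returns the same owner and repository name for each.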

Pushing Commits to GitHub

After you link notebooks with a GitHub profile, you can push commits to GitHub directly from a notebook that is associated with a running cluster.

Before you push the commits, ensure that the following requirements are met:

  • The GitHub profile token must have write permissions for the repository to push commits.
  • The associated cluster must be running.

Perform the following steps to push a commit:

  1. Click the Manage notebook versions icon at the top right of the notebook. It expands and provides the version details.

  2. Click the Push icon to commit. A dialog opens to push commits. The following figure shows the version details and the Push to GitHub dialog.

    ../../../../../_images/PushtoGitHub1.png
  3. Add a commit message and click Save to push the commit to the GitHub repository. You can select the force commit option to force push over the old commit, regardless of any conflict.

Note

Qubole does not store commits or revisions of notebooks. However, commits or revisions of notebooks can be fetched from the user’s GitHub account whenever required.
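
The difference between a normal push and a force push can be illustrated with a toy model. This is a simplified sketch of general Git behavior under the assumption that a branch is represented only by its tip commit ID. It is not Qubole's implementation, and the RemoteBranch class is hypothetical.

```python
class RemoteBranch:
    """Toy model of a remote branch: only the tip commit ID is tracked."""

    def __init__(self, tip):
        self.tip = tip

    def push(self, based_on, new_commit, force=False):
        """Accept the push if it builds on the current tip, or if forced."""
        if based_on != self.tip and not force:
            # Rejected: the remote moved after this copy was checked out.
            return False
        self.tip = new_commit
        return True
```

In this model, a push that is based on an outdated tip is rejected unless force is set. A forced push replaces the tip, so the commit that it overwrites is no longer the head of the branch. Use the force option only when you are sure that the overwritten changes are not needed.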

Restoring a Commit from GitHub

  1. Click the Manage notebook versions icon that is on the top-right of the notebook. It expands and provides the version details.
  2. Select a version from the list and click Restore to check out that version.
  3. In the confirmation dialog box, click OK to check out that version.

Note

Qubole does not store commits or revisions of notebooks. However, commits or revisions of notebooks can be fetched from the user’s GitHub account whenever required.
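
Because the versions live in GitHub, the same list can be retrieved outside Qubole. The following sketch builds the GitHub REST API URL that lists commits touching one file on one branch. It assumes github.com as the API host, and the commits_url helper and its example values are hypothetical.

```python
from urllib.parse import quote, urlencode


def commits_url(owner, repo, branch, object_path,
                api_base="https://api.github.com", per_page=20):
    """Build the 'list commits' URL filtered by branch (sha) and file path."""
    query = urlencode({"sha": branch, "path": object_path, "per_page": per_page})
    return f"{api_base.rstrip('/')}/repos/{quote(owner)}/{quote(repo)}/commits?{query}"
```

Send a GET request to the resulting URL with the token in the Authorization header. The response is a JSON array of commits, newest first.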

Creating a Pull Request from Notebooks

  1. Open the required notebook.

  2. Click the gear icon at the top-right corner of the notebook, and select Configure GitHub Link. The Link Notebook to GitHub dialog is displayed.

  3. Click the Create PR hyperlink.

  4. Proceed with the steps in GitHub to create the PR.

    For more information, see GitHub Documentation.
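
If you prefer to open the pull request page directly, GitHub provides a compare URL for two branches. The following sketch builds that URL. Whether the Create PR hyperlink in Qubole opens this same page is an assumption, and the compare_url helper and its example values are hypothetical.

```python
from urllib.parse import quote


def compare_url(owner, repo, base, head, host="github.com"):
    """Build the GitHub compare page URL that opens the pull request form."""
    # quote() keeps "/" by default, so branch names such as feature/x stay readable.
    return (
        f"https://{host}/{quote(owner)}/{quote(repo)}"
        f"/compare/{quote(base)}...{quote(head)}?expand=1"
    )
```

Here, base is the branch that receives the changes and head is the branch that contains the notebook commits.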

Resolving Conflicts While Using GitHub

Conflicts can occur when you push commits to GitHub or check out commits from GitHub.

Note

You can select the force commit option to force push over the old commit, regardless of any conflict.

Perform the following steps to resolve conflicts in commits:

  1. Clone the notebook.
  2. Link the cloned notebook to the same GitHub repo branch and path as the original notebook.
  3. Check out the latest version of the cloned notebook.
  4. Manually port changes from the original notebook to the cloned notebook.
  5. Commit the cloned notebook after porting the changes.
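
To find what needs to be ported manually, you can compare the two notebooks paragraph by paragraph. The following sketch assumes that both notebooks are available as Zeppelin note JSON with a top-level paragraphs list whose items have a text field. The function names are hypothetical.

```python
import difflib
import json


def paragraph_texts(note_json):
    """Return the source text of each paragraph in a Zeppelin note JSON string."""
    note = json.loads(note_json)
    return [p.get("text") or "" for p in note.get("paragraphs", [])]


def changes_to_port(original_json, cloned_json):
    """List (tag, cloned_paragraphs, original_paragraphs) for runs that differ."""
    cloned = paragraph_texts(cloned_json)
    original = paragraph_texts(original_json)
    matcher = difflib.SequenceMatcher(a=cloned, b=original, autojunk=False)
    return [
        (tag, cloned[i1:i2], original[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Each entry has a tag of replace, insert, or delete, followed by the affected paragraphs in the cloned notebook and their counterparts in the original notebook. An empty list means that the paragraph text is identical and nothing needs to be ported.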