Using the User Interpreter Mode for Spark Notebooks

Qubole supports legacy and user interpreter modes in a Spark cluster. A system administrator can configure the mode at the cluster level via the QDS UI or the REST API.

About User Mode

In user mode, a QDS Spark cluster provides a dedicated interpreter for each user who runs a notebook:

  • The interpreter is named as follows: user_<user's_email_name>_<user's_email_domain> (user_ is a literal constant). For example, for a user whose email address is abc@xyz.com, the interpreter name is set to user_abc_xyz.
  • As the user, you can also create additional interpreters.
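
The naming rule above can be sketched as a small Python helper. This is an illustration only, not QDS code: the function name user_interpreter_name is hypothetical, and the handling of the domain (keeping only the part before the first dot) is an assumption inferred from the single documented example, abc@xyz.com. How QDS treats multi-part domains or special characters is not stated here.

```python
def user_interpreter_name(email: str) -> str:
    """Return the interpreter name assumed for a user's email address.

    Hypothetical helper: it reproduces the documented example only.
    """
    local, sep, domain = email.strip().partition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    # Assumption: only the first label of the domain is used
    # ("xyz" from "xyz.com"), as the documented example implies.
    domain_name = domain.split(".")[0]
    # "user_" is the literal prefix described above.
    return f"user_{local}_{domain_name}"


print(user_interpreter_name("abc@xyz.com"))  # user_abc_xyz
```

A helper like this can be handy for spotting your own interpreter in a long interpreter list, but treat the name shown in the QDS UI as authoritative.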

Note

The Spark interpreters support per-user AWS credentials when started in user mode. This feature is not enabled for all users by default; create a ticket with Qubole Support to enable it. When the feature is enabled, all new clusters are created in user interpreter mode.

Using Your Interpreter

  1. From the main menu of the QDS UI, navigate to Notebooks.
  2. Choose a notebook from the left panel, or choose NEW to create a new one. Make sure the cluster on which the notebook runs is configured for user mode, as follows:
    1. Click the gear icon that appears when you mouse over the notebook name in the left panel, then choose View Details. This shows you the name and ID of the cluster.
    2. Pull down the Clusters menu (near the top right of the screen) and find the cluster.
    3. Mouse over the cluster name and click on the eyeball icon that appears on the right. The resulting page should show Notebook Interpreter Mode set to user. If it doesn’t, you can assign the notebook to another cluster (click the gear icon as in step 2a above and choose Configure Notebook); or your system administrator can configure user mode for this cluster.
  3. Click on the name of the notebook in the left panel to load it.
  4. Click the gear icon next to Interpreters to see the list of available interpreters.
  5. If your interpreter (named as described above) is not at the top of the list, click on it to highlight it, then drag it to the top of the list and click Save.

You are now ready to run your notebook with your interpreter. Remember that the Spark cluster must be up and running, as indicated by a green dot next to the cluster name in the Clusters pull-down list.

Creating Your Own Interpreters

When user mode is configured for the cluster, you can create your own interpreters in addition to the interpreter that is automatically created for you.

To create and use an interpreter:

  1. Choose a notebook and make sure user mode is configured for its cluster, as described in steps 1-3 of Using Your Interpreter.

  2. Click the Interpreters link near the top right. The resulting page shows you the current set of available interpreters.

  3. Click Create to create a new interpreter.

  4. On the resulting page, name the interpreter and choose the type and properties as prompted, then click Save.

    If per-user AWS credentials are enabled, specify your email address as the value of the spark.yarn.queue property to create a user-level interpreter. You cannot modify the settings of non-user-level interpreters.

  5. The new interpreter now appears in the list of interpreters, with the properties you have defined. To change the properties later, click the edit button on the right.

  6. Click the name of the notebook in the left panel to reload it, then configure the notebook to use your new interpreter as described in steps 4-5 of Using Your Interpreter.

Using Another User’s Interpreter

In user mode, interpreters can be shared. To use another user’s interpreter, drag it to the top of the interpreter list and click Save, as described in steps 4-5 of Using Your Interpreter.

Sharing Variable Settings

When you set a variable in one notebook, that variable will have the same value in all notebooks that use the same interpreter, even if another user is using the interpreter. For more information, see Notebook Interpreter Operations.
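
The sharing behavior can be pictured with a toy Python model. This is a conceptual sketch, not how QDS or the Spark interpreter is implemented: ToyInterpreter and ToyNotebook are hypothetical classes, and the only assumption modeled is that notebooks bound to the same interpreter execute in one shared namespace.

```python
class ToyInterpreter:
    """Stand-in for one interpreter: a single namespace for all bound notebooks."""

    def __init__(self, name):
        self.name = name
        self.namespace = {}


class ToyNotebook:
    """Stand-in for a notebook bound to exactly one interpreter."""

    def __init__(self, title, interpreter):
        self.title = title
        self.interpreter = interpreter

    def run(self, source):
        # Code runs in the interpreter's namespace, not in a per-notebook one.
        exec(source, self.interpreter.namespace)

    def value_of(self, variable):
        return self.interpreter.namespace.get(variable)


shared = ToyInterpreter("user_abc_xyz")
separate = ToyInterpreter("user_def_xyz")

notebook_a = ToyNotebook("A", shared)
notebook_b = ToyNotebook("B", shared)     # same interpreter as A
notebook_c = ToyNotebook("C", separate)   # different interpreter

notebook_a.run("threshold = 10")
notebook_b.run("threshold = threshold * 2")  # sees and changes A's variable

print(notebook_a.value_of("threshold"))  # 20
print(notebook_c.value_of("threshold"))  # None
```

The practical consequence is that a variable name reused in two notebooks on the same interpreter can be overwritten unexpectedly, so distinctive variable names or separate interpreters are worth considering when notebooks should stay isolated.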

Effect of Existing Bindings on Interpreter Modes

When user mode is set for a Spark cluster:

  • When you run a notebook that you own and that is bound to a legacy-mode interpreter, the notebook runs with that legacy interpreter. This ensures backward compatibility.
  • When you run a notebook bound to an interpreter owned by another user, QDS rebinds the notebook to your interpreter and runs it.
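
The two rules above can be modeled as a short decision function. This is a hedged sketch rather than QDS logic: the Interpreter record and the interpreter_to_run function are hypothetical, and the final fallback (keep the existing binding in every case the rules do not mention) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interpreter:
    name: str
    mode: str              # "legacy" or "user"
    owner: Optional[str]   # None for legacy interpreters in this sketch


def interpreter_to_run(current_user, notebook_owner, bound, own):
    """Pick the interpreter used when current_user runs a notebook.

    bound is the notebook's existing binding; own is the current
    user's dedicated user-mode interpreter.
    """
    # Rule 1: your own notebook bound to a legacy interpreter keeps it.
    if bound.mode == "legacy" and notebook_owner == current_user:
        return bound
    # Rule 2: a binding to another user's interpreter is replaced by your own.
    if bound.mode == "user" and bound.owner != current_user:
        return own
    # Assumption: any case not covered by the two rules keeps its binding.
    return bound


legacy = Interpreter("spark_legacy", "legacy", None)
mine = Interpreter("user_abc_xyz", "user", "abc@xyz.com")
theirs = Interpreter("user_def_xyz", "user", "def@xyz.com")

print(interpreter_to_run("abc@xyz.com", "abc@xyz.com", legacy, mine).name)
print(interpreter_to_run("abc@xyz.com", "def@xyz.com", theirs, mine).name)
```

The first call models rule 1 and selects the legacy interpreter; the second models rule 2 and selects the current user's interpreter.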