Configuring Spark Settings for Jupyter Notebooks

By default, Jupyter notebooks use the cluster-wide Spark configuration. To configure the Spark application for a specific Jupyter notebook, specify the required Spark settings by using the %%configure magic.

Note

You can configure Spark settings only for Jupyter notebooks with Spark kernels.

You should specify the required configuration at the beginning of the notebook, before you run your first Spark-bound code cell.
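For example, the first code cell of a notebook can set the session configuration, and Spark-bound cells that follow then run with those settings. The following is a minimal sketch; the property values are illustrative only, and the spark session variable is assumed to be pre-created by the Spark kernel.

%%configure
{"executorMemory": "2048M", "executorCores": 2, "numExecutors": 4}

A later cell in the same notebook can then run Spark-bound code, such as spark.range(1000).count(), against the configured application.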

If you want to specify the required configuration after running a Spark-bound command, use the -f option with the %%configure magic. Note that using the -f option discards all progress made in the previous Spark jobs.

The following code shows an example of specifying a Spark configuration.

%%configure -f
{"executorMemory": "3072M", "executorCores": 4, "numExecutors": 10}

Note

The Spark drivers are created on the cluster worker nodes by default for better load distribution and better use of cluster resources. If you want to run the Spark driver on the master node, contact Qubole Support.