Using Deep Learning Notebooks
Python 3.5 is the default and supported version. To use Python 2.7.13 on Deep Learning notebooks, go to the interpreter settings and set the
zeppelin.pyspark.python property in the user-level interpreter to the path of the Python 2.7.13 binary.
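As an illustrative sketch of that setting (the actual binary path is installation-specific and is not given here), the interpreter property would look like:

```
zeppelin.pyspark.python    <path to the Python 2.7.13 binary>
```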
Python is the default language on Deep Learning clusters and notebooks. In addition to the
pyspark interpreter, Deep
Learning notebooks support other Spark interpreter types.
There are two modes of using Deep Learning:
Non-distributed Deep Learning: In this mode, each Deep Learning job uses the resources of a single slave node. Multiple users can share the same cluster to run different jobs. The recommended way to launch Deep Learning jobs is through notebooks.
Distributed Deep Learning: This mode supports use cases where the data is too large for a single node to store and process. Currently, Qubole supports distributed mode only for TensorFlow, using Yahoo's open-source project TensorflowOnSpark. Because dynamic allocation is disabled by default on Deep Learning clusters, enable it by setting the appropriate properties in the Deep Learning notebook's Spark interpreter, with the executor count set to <the number of executors you want>.
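The exact property names are not listed above. As a sketch, assuming the standard Apache Spark dynamic-allocation settings (these names come from Spark itself, not from this document), the interpreter properties would look like:

```
spark.dynamicAllocation.enabled    true
spark.executor.instances           <the number of executors you want>
```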
In this mode, each executor occupies an entire slave node. Ensure that the cluster's minimum and maximum node counts are set appropriately so that Deep Learning jobs can autoscale when required. For distributed mode, the maximum cluster size must be one more than the number of executors given as input to the TFOS job.
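The sizing rule above (maximum cluster size = number of executors + 1) can be sketched as a small helper. The function name and error message are illustrative only, not part of any Qubole API:

```python
def min_max_cluster_size(num_executors):
    """Return the smallest valid value for the cluster's maximum node
    count when running a distributed TFOS job with num_executors
    executors.

    Each executor occupies an entire slave node, so the maximum
    cluster size must be at least num_executors + 1.
    (Illustrative helper, not a Qubole API.)
    """
    if num_executors < 1:
        raise ValueError("a TFOS job needs at least one executor")
    return num_executors + 1

# A TFOS job with 4 executors needs a maximum cluster size of at least 5.
print(min_max_cluster_size(4))  # → 5
```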