Understanding Data Encryption in QDS

Command metadata and results are stored in your cloud storage. By default, they are also cached on QDS servers to speed up access.

Qubole caches the following information:

  • Metadata for one day (24 hours)
  • Command results for 7 days
  • Read-only notebook content along with the results in encrypted form for 30 days
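As a rough illustration of the retention periods listed above, the following sketch computes when a cached item would age out. The dictionary keys and function names are hypothetical labels chosen for this example and are not part of any QDS API. The sketch also ignores early eviction, such as a notebook cache being discarded when its cluster comes up.

```python
from datetime import datetime, timedelta

# Retention periods taken from the list above.
# The keys are hypothetical labels, not QDS identifiers.
CACHE_RETENTION = {
    "metadata": timedelta(hours=24),
    "command_results": timedelta(days=7),
    "read_only_notebook": timedelta(days=30),
}


def cache_expiry(kind: str, cached_at: datetime) -> datetime:
    """Return the time at which an item of the given kind ages out of the cache."""
    return cached_at + CACHE_RETENTION[kind]


def is_cached(kind: str, cached_at: datetime, now: datetime) -> bool:
    """Return True while the item is still inside its retention window."""
    return now < cache_expiry(kind, cached_at)
```

For example, `cache_expiry("command_results", datetime(2024, 1, 1))` returns `datetime(2024, 1, 8)`, seven days after the result was cached.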

Encryption for Data at Rest on Azure

  • Data at rest is encrypted automatically in Azure Blob storage, and Azure does not provide a way to disable encryption.
  • Data at rest is encrypted by default in Azure Data Lake storage. If you accept the default, you cannot change the setting after the account is set up.
  • QDS supports the Azure encryption mechanisms; there is nothing you need to do in QDS.

Encryption on AWS

QDS provides the following encryption mechanisms to protect data on AWS.

Encrypting AWS Cached Data

To enable encryption of results while they are fetched from object storage, create a ticket with Qubole Support. Enabling this option might slow down data retrieval, because QDS would no longer cache the results.

There is no option to disable Metastore caching.

Read-only notebooks are always cached in encrypted form to provide fast offline access when the attached cluster is down. The cache is discarded when the notebook's attached cluster comes up or when the notebook is attached to another live cluster.

Encrypting Data on Amazon S3

Qubole supports protecting data on Amazon S3 through encryption. It supports both server-side and client-side encryption, as described in Enabling Data Encryption in QDS.

Encrypting Ephemeral Data on AWS Clusters

On QDS clusters, you can encrypt data on ephemeral HDFS as described in Enabling Encryption of Ephemeral Data in QDS Clusters.

To enable encryption on the ephemeral drives through the Cluster REST API, see security_settings.
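The following sketch shows what a request body for such a call might look like. It only builds and prints the JSON; the endpoint, authentication, and other cluster fields are omitted. The field name `encrypted_ephemerals` and the helper `build_security_settings` are assumptions made for illustration, so confirm the exact parameter names in the security_settings reference before using them.

```python
import json


def build_security_settings(encrypt_ephemerals: bool) -> dict:
    """Build a cluster request fragment that toggles ephemeral-drive encryption."""
    # "encrypted_ephemerals" is an assumed field name; verify it against
    # the security_settings documentation for your API version.
    return {"security_settings": {"encrypted_ephemerals": encrypt_ephemerals}}


# Serialize the fragment as it might appear in a request body.
payload = json.dumps(build_security_settings(True), indent=2)
print(payload)
```

Keeping the fragment in a small helper makes it easy to merge into a larger cluster configuration without hard-coding JSON strings.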