Understanding Data Encryption on QDS
Qubole always caches metadata, command results, and notebook paragraph-results. Notebook paragraph-results and commands are stored in Amazon S3 and cached on Qubole servers. The data is cached for the following periods:
- Metadata for one day (24 hours)
- Command results for 7 days
- Notebook paragraph-results for 30 days
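As a rough illustration of the retention periods listed above, the following Python sketch computes when a cached entry would expire. The names `CACHE_RETENTION`, `cache_expiry`, and `is_still_cached` are hypothetical, and the sketch assumes that the retention clock starts at the moment the entry is cached, which the text above does not state.

```python
from datetime import datetime, timedelta

# Retention periods taken from the list above.
CACHE_RETENTION = {
    "metadata": timedelta(days=1),                     # one day (24 hours)
    "command_results": timedelta(days=7),              # 7 days
    "notebook_paragraph_results": timedelta(days=30),  # 30 days
}


def cache_expiry(kind: str, cached_at: datetime) -> datetime:
    """Return the time at which an entry of the given kind would leave the cache.

    Assumption: retention is measured from the time the entry was cached.
    """
    return cached_at + CACHE_RETENTION[kind]


def is_still_cached(kind: str, cached_at: datetime, now: datetime) -> bool:
    """Return True while `now` is strictly before the computed expiry time."""
    return now < cache_expiry(kind, cached_at)
```

For example, `is_still_cached("metadata", cached_at, datetime.now())` would report whether a metadata entry cached at `cached_at` is still inside its 24-hour window under this assumption.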
Encrypting Cached Data
To enable encryption of results while they are fetched from the object storage, create a ticket with Qubole Support. Enabling this option might slow down data retrieval, because QDS then does not cache the results.
There is no option to disable Metastore caching, and read-only notebooks are always cached with encryption enabled.
Encrypting Data on Amazon S3
Qubole supports securing data on Amazon S3 through encryption. It supports both server-side and client-side encryption, as described below:
- Amazon S3 server-side encryption (SSE), in which the data is encrypted before it is saved to disk in S3 and decrypted when it is read. Encryption and decryption take place in the S3 infrastructure and are transparent to (authenticated) clients. Enabling Encryption for Data at Rest (AWS) describes how to encrypt data on Amazon S3 by configuring different types of options. On the S3a filesystem, Qubole supports encryption through SSE-KMS and SSE-C, as described in Enabling KMS and Customer Provided Keys Server-side Encryption on the S3a File System.
- Amazon S3 client-side encryption (CSE), in which the data is encrypted and decrypted on the client, that is, on the cluster. Enable AWS Key Management Service Client-side Encryption on the S3a File System describes client-side encryption with the AWS Key Management Service, a beta feature that protects data by encrypting it on the client.
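To make the two server-side variants mentioned above more concrete, here is a minimal Python sketch of the extra request parameters that distinguish SSE-KMS from SSE-C at the Amazon S3 API level. It uses the parameter names of the boto3 `put_object` call (`ServerSideEncryption`, `SSEKMSKeyId`, `SSECustomerAlgorithm`, `SSECustomerKey`). The helper names `sse_kms_args` and `sse_c_args` are hypothetical, and this is not how encryption is configured in QDS itself; the linked pages describe that. Client-side encryption is not covered here.

```python
from typing import Optional


def sse_kms_args(kms_key_id: Optional[str] = None) -> dict:
    """Build S3 request parameters for SSE-KMS.

    With no key ID, S3 falls back to the AWS managed KMS key for S3.
    """
    args = {"ServerSideEncryption": "aws:kms"}
    if kms_key_id is not None:
        args["SSEKMSKeyId"] = kms_key_id
    return args


def sse_c_args(customer_key: bytes) -> dict:
    """Build S3 request parameters for SSE-C (customer-provided key).

    SSE-C uses AES-256, so the key must be exactly 32 bytes long.
    The same key has to be supplied again when the object is read.
    """
    if len(customer_key) != 32:
        raise ValueError("SSE-C requires a 256-bit (32-byte) key")
    return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
```

With boto3 installed and credentials configured, these dictionaries could be passed as keyword arguments, for example `boto3.client("s3").put_object(Bucket="<bucket>", Key="<key>", Body=b"data", **sse_kms_args("<kms-key-id>"))`, where the bucket, key, and KMS key ID are placeholders.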
Encrypting Data on Ephemeral HDFS
In the QDS UI, encryption of data on ephemeral HDFS is available as an option, as described in Enable Encryption on Ephemeral HDFS through QDS UI.
To enable encryption on the ephemeral drives through the Cluster REST API, see security_settings.
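As a hedged sketch of what such a REST API request body might look like, the following Python snippet builds a JSON payload with a `security_settings` object. The field name `encrypted_ephemerals` and its position directly under `security_settings` are assumptions that should be checked against the security_settings reference for the API version in use. The function name `encrypted_ephemerals_payload` is hypothetical, and the endpoint URL and authentication header are deliberately left out.

```python
import json


def encrypted_ephemerals_payload(enabled: bool = True) -> str:
    """Return a JSON body that toggles ephemeral-drive encryption.

    Assumption: the cluster API accepts a boolean `encrypted_ephemerals`
    field inside `security_settings`; verify this before relying on it.
    """
    body = {"security_settings": {"encrypted_ephemerals": enabled}}
    return json.dumps(body)
```

The resulting string would be sent as the body of a cluster create or edit request; all other cluster settings are omitted from this sketch.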