Enabling Client-side Encryption (AWS)

Qubole supports AWS Key Management Service (KMS) client-side encryption only on the S3a filesystem, where it is used to encrypt and decrypt data. The feature is available on Hadoop 2 and Spark clusters and works with the Hadoop, Hive, and Spark engines.

Qubole supports AWS KMS client-side encryption at the account level and at the cluster level. If it is enabled at the account level, it is enabled on all Hadoop 2 and Spark clusters of that QDS account.

Note

The AWS KMS client-side encryption feature is available for beta access. To enable it on a QDS account or on a specific Hadoop 2/Spark cluster, create a ticket with Qubole Support.

Because AWS KMS client-side encryption is supported only on the S3a filesystem, the account must use the S3a filesystem. Ensure that the Amazon S3 bucket and the AWS KMS key are in the same AWS region, because a key from one AWS region is not recognized in other AWS regions.
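The same-region requirement can be checked before the key is configured. The following is a minimal sketch, not a Qubole tool: it assumes the key is given as a full KMS key ARN in the standard form arn:aws:kms:<region>:<account-id>:key/<key-id> and that you already know the bucket's region. The function names kms_key_region and same_region are hypothetical.

```python
def kms_key_region(key_arn: str) -> str:
    """Return the region field of a KMS key ARN.

    Assumes the standard layout:
    arn:<partition>:kms:<region>:<account-id>:key/<key-id>
    """
    parts = key_arn.split(":")
    if len(parts) < 6 or parts[0] != "arn" or parts[2] != "kms":
        raise ValueError(f"not a KMS key ARN: {key_arn!r}")
    return parts[3]


def same_region(bucket_region: str, key_arn: str) -> bool:
    """True if the bucket region matches the region encoded in the key ARN."""
    return bucket_region == kms_key_region(key_arn)
```

For example, same_region("us-east-1", "arn:aws:kms:us-west-2:111122223333:key/example-key-id") returns False, which indicates that the key would not be recognized for that bucket. The account ID and key ID here are placeholders.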

You can enable KMS client-side encryption at the cluster level by adding fs.s3a.awsKmsCmkId=<kms key> as a Hadoop override in the cluster’s configuration, either through the cluster UI > Advanced Configuration or through a cluster REST API call. You must restart the cluster after adding this override for the setting to take effect on that cluster.

However, enabling client-side encryption at the cluster level has this disadvantage:

  • If the KMS key is present only as a Hadoop override and is not stored on the Qubole side, Qubole cannot decrypt results in the Results tab of the Analyze UI for queries whose data is read directly from AWS S3.
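As an illustration of the cluster-level override described above, the sketch below adds or replaces the fs.s3a.awsKmsCmkId entry in a block of Hadoop overrides. It assumes the overrides are kept as newline-separated key=value text; that layout and the helper name set_override are assumptions made for this example, and the code does not call the cluster REST API or any other Qubole interface.

```python
KMS_OVERRIDE_KEY = "fs.s3a.awsKmsCmkId"


def set_override(overrides: str, key: str, value: str) -> str:
    """Return the overrides text with `key` set to `value`.

    Assumes one key=value pair per line. Any existing line for
    `key` is dropped, and the new pair is appended at the end.
    """
    lines = [ln for ln in overrides.splitlines() if ln.strip()]
    kept = [ln for ln in lines if ln.split("=", 1)[0].strip() != key]
    kept.append(f"{key}={value}")
    return "\n".join(kept)
```

Calling set_override(existing_overrides, KMS_OVERRIDE_KEY, "<kms key>") yields text that can be pasted into the Hadoop overrides field. The cluster must still be restarted afterward for the setting to take effect.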