Enabling SSE-KMS

Server-side encryption with AWS KMS-managed keys (SSE-KMS) is one of the server-side encryption types that AWS supports.

For details on client-side KMS encryption, see Enabling Client-side Encryption (AWS).

Enabling SSE in the QDS Control Plane

The QDS Control Plane denotes all QDS components except the clusters. Understanding the Qubole Folders in the Default Location on S3 (AWS) lists the folders in the account’s default location to which QDS can write data.

Currently, QDS lets you enable SSE-KMS only through a REST API call, as described in Enable SSE on the QDS Control Plane.

Enabling SSE in Hadoop and Spark Clusters

Before enabling SSE in Hadoop or Spark clusters, you must first enable SSE in the QDS Control Plane, as described in Enabling SSE in the QDS Control Plane.

To enable SSE-KMS in Hadoop and Spark clusters, perform these steps:

  1. Navigate to the Clusters page and click Edit to edit an existing cluster, or click New to create a new cluster.
  2. In the cluster’s Advanced Configuration tab, under Override Hadoop Configuration Variables, add fs.s3a.server-side-encryption-algorithm=SSE-KMS.

The same property can be applied to Hive commands. It is set per command, in the same command session as the command it applies to.

For example,

SET fs.s3a.server-side-encryption-algorithm=SSE-KMS;
CREATE EXTERNAL TABLE New2 (`Col0` STRING, `Col1` STRING, `Col2` STRING)
PARTITIONED BY (`20100102` STRING, `IN` STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://ap-dev-qubole/common/hive/30day_1/30daysmall';

Enabling the Encryption Key

Set the following properties to use SSE-KMS on the S3a filesystem:

  1. fs.s3a.server-side-encryption-algorithm=SSE-KMS
  2. fs.s3a.server-side-encryption.key=<key>: The encryption key used to encrypt the data. If you leave this property empty, the default S3 KMS key is used. To use a different key, set this property to the ID of that KMS key.
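
As an illustration of how the two properties combine, the following minimal Python sketch assembles the key=value lines that could be pasted into Override Hadoop Configuration Variables. The function name s3a_sse_kms_overrides and the placeholder key ID are hypothetical and not part of QDS. The sketch also assumes that omitting the key property has the same effect as leaving it empty.

```python
from typing import List, Optional


def s3a_sse_kms_overrides(kms_key_id: Optional[str] = None) -> List[str]:
    """Return key=value lines for the two S3a SSE-KMS properties."""
    lines = ["fs.s3a.server-side-encryption-algorithm=SSE-KMS"]
    # Emit the key property only when a specific KMS key is wanted.
    # Assumption: omitting it behaves like leaving it empty, so the
    # default S3 KMS key is used.
    if kms_key_id:
        lines.append(f"fs.s3a.server-side-encryption.key={kms_key_id}")
    return lines


if __name__ == "__main__":
    # "example-key-id" is a placeholder, not a real KMS key ID.
    print("\n".join(s3a_sse_kms_overrides("example-key-id")))
```

Calling the function without an argument yields only the algorithm line, which corresponds to relying on the default S3 KMS key.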

Enabling SSE-KMS while using Hadoop DistCp

When you use Hadoop DistCp, you can set these server-side encryption parameters along with the other parameters:

  • s3ServerSideEncryption: Enables encryption of data at the object level as S3 writes it to disk.
  • s3SSEAlgorithm: The algorithm used for encryption. Specify SSE-KMS as its value. If you do not specify it but s3ServerSideEncryption is enabled, the AES256 algorithm is used by default.
  • encryptionKey: The key used to encrypt the data. If the algorithm is SSE-KMS, the key is optional because AWS KMS is used.
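
The defaulting rules in this list can be sketched in Python as follows. This is only a model of the behavior described above, not DistCp code: the function name effective_distcp_sse is hypothetical, and the dictionary is just a convenient way to show which settings take effect. It does not represent the actual DistCp command-line syntax.

```python
from typing import Dict, Optional


def effective_distcp_sse(enabled: bool,
                         algorithm: Optional[str] = None,
                         encryption_key: Optional[str] = None) -> Dict[str, object]:
    """Resolve the three DistCp parameters into the settings that take effect."""
    if not enabled:
        # s3ServerSideEncryption is off, so no SSE settings apply.
        return {}
    settings: Dict[str, object] = {"s3ServerSideEncryption": True}
    # Without an explicit s3SSEAlgorithm, AES256 is the default.
    settings["s3SSEAlgorithm"] = algorithm or "AES256"
    # The key is optional; include it only when one was supplied.
    if encryption_key:
        settings["encryptionKey"] = encryption_key
    return settings


if __name__ == "__main__":
    print(effective_distcp_sse(True))             # algorithm falls back to AES256
    print(effective_distcp_sse(True, "SSE-KMS"))  # SSE-KMS without an explicit key
```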

Enabling SSE-KMS in the Presto Cluster

Perform these steps to enable SSE-KMS in Presto:

  1. In the Presto catalog/hive.properties settings, set hive.s3.sse.enabled=true.
  2. Set the type of encryption to KMS:
    • Set hive.s3.sse.type=KMS for Presto 0.180 or later versions.
  3. (Optional) Set the KMS key by using the hive.s3.sse.kms-key-id property, for example, hive.s3.sse.kms-key-id=<KMS Key ID>. If you do not set the KMS key, the default key is used.
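
The steps above can be sketched as a small Python helper that produces the corresponding catalog/hive.properties lines. The function name presto_sse_kms_properties and the placeholder key ID are hypothetical. The sketch assumes Presto 0.180 or later, where hive.s3.sse.type applies.

```python
from typing import List, Optional


def presto_sse_kms_properties(kms_key_id: Optional[str] = None) -> List[str]:
    """Return catalog/hive.properties lines that enable SSE-KMS in Presto."""
    lines = [
        "hive.s3.sse.enabled=true",  # step 1: turn on server-side encryption
        "hive.s3.sse.type=KMS",      # step 2: select KMS as the encryption type
    ]
    # Step 3 is optional: without a key ID, the default key is used.
    if kms_key_id:
        lines.append(f"hive.s3.sse.kms-key-id={kms_key_id}")
    return lines


if __name__ == "__main__":
    # "example-key-id" is a placeholder, not a real KMS key ID.
    print("\n".join(presto_sse_kms_properties("example-key-id")))
```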

For more information, see catalog/hive.properties.