Choosing between a Cross-account IAM Role and Dual IAM Roles

QDS supports cross-account IAM Role-based authentication and dual IAM Role-based authentication. Both are briefly described below.

Cross-account IAM Role

Qubole allows you to configure a cross-account IAM role to interact with AWS resources. This is recommended over using AWS keys.

To set up AWS integration with Qubole using a cross-account role, follow the steps in Configuring the Qubole Data Service, which walks through creating the IAM roles on both the AWS side and the Qubole side. This setup is more secure than using AWS IAM keys. Its drawback is that Qubole has access to the Amazon S3 buckets that the role can reach, and those buckets may contain sensitive data. If that is a concern, create an additional IAM Role that has access to the sensitive data, as described in Dual IAM Role.

For more information, see Managing Roles and Configuring the Qubole Data Service/Creating a Cross-account IAM Role for QDS.
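As a rough illustration of what a cross-account role looks like on the AWS side, the sketch below builds a standard IAM trust policy that lets another AWS account assume the role when it presents an agreed external ID. The account ID and external ID are placeholders, not real Qubole values, and the function name is hypothetical. The actual values and the exact policy come from the steps in Configuring the Qubole Data Service.

```python
import json


def cross_account_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Build an IAM trust policy that lets one external account assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The external account that is allowed to assume this role.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The caller must present the agreed external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values for illustration only (assumptions, not real Qubole IDs).
policy = cross_account_trust_policy("111122223333", "example-external-id")
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document that would be attached as the role's trust relationship. What the role may then do, for example which Amazon S3 buckets it can read, is governed by separate permission policies.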

Dual IAM Role

Use dual IAM Role-based authentication when you want access to sensitive data confined to a separate role that Qubole cannot assume.

This authentication mode requires two IAM Roles, referred to here as Role A and Role B. Configuring the Qubole Data Service provides a step-by-step procedure for creating the IAM roles on both the AWS side and the Qubole side.

Role A acts as the cross-account IAM role, while Role B has access to the Amazon S3 buckets that contain sensitive data. Because Qubole cannot assume Role B, it cannot use that role to read the Amazon S3 data, which makes this setup more secure than a single cross-account role. Use the steps described in Creating Dual IAM Roles for your Account to set this role on the clusters that need access to sensitive data. With this approach, you can restrict access to the sensitive data to only those commands and queries that must operate on it, and deny all other access to Qubole users in the account.

For more information, see Creating Dual IAM Roles for your Account.
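The separation between the two roles can be sketched as a pair of policy documents for Role B. This is a minimal illustration under stated assumptions, not the documented Qubole configuration: it assumes Role B is assumed by the cluster's EC2 instances (the `ec2.amazonaws.com` service principal), and the bucket name, account ARN, and function names are all hypothetical. The point it shows is that the Qubole account does not appear among the principals trusted by Role B.

```python
def sensitive_data_policy(bucket: str) -> dict:
    """Permissions policy for Role B: access limited to one sensitive bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


def role_b_trust_policy() -> dict:
    """Trust policy for Role B (assumption: only EC2 instances may assume it)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def trusted_principals(trust_policy: dict) -> set:
    """Collect every principal named in the Allow statements of a trust policy."""
    principals = set()
    for statement in trust_policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        for value in statement["Principal"].values():
            principals.update([value] if isinstance(value, str) else value)
    return principals


# Placeholder ARN standing in for the Qubole account (not a real value).
QUBOLE_ACCOUNT_ARN = "arn:aws:iam::111122223333:root"

role_b_trust = role_b_trust_policy()
role_b_permissions = sensitive_data_policy("example-sensitive-bucket")
print(QUBOLE_ACCOUNT_ARN in trusted_principals(role_b_trust))  # prints False
```

Role A would carry a cross-account trust policy instead, and its permissions would leave out the sensitive bucket.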

However, this approach has the following limitations:

  • Only Hive on coordinator works.
  • Explore and Scheduler dependencies do not work. They work only over the default location or an Amazon S3 path to which the cross-account IAM role has access.
  • Importing and exporting data do not work.
  • An Amazon S3 path in commands works only when the path is accessible to the cross-account IAM role.
  • Data in results and logs remains accessible because it is stored in the default location.
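Several of the limitations above depend on whether an Amazon S3 path is reachable by the cross-account IAM role. The sketch below is a simplified pre-check that compares a path against a list of prefixes the role is known to cover. The prefixes and the function name are hypothetical, and real access is decided by IAM policy evaluation in AWS, not by string matching, so treat this only as a way to reason about which paths are likely to work.

```python
def is_covered_by_prefixes(s3_path: str, allowed_prefixes: list) -> bool:
    """Return True if s3_path equals or sits under one of the allowed prefixes."""
    # Normalize with a trailing slash so "bucket" does not match "bucket-other".
    normalized = s3_path.rstrip("/") + "/"
    return any(
        normalized.startswith(prefix.rstrip("/") + "/")
        for prefix in allowed_prefixes
    )


# Hypothetical prefixes that the cross-account role can access.
cross_account_prefixes = [
    "s3://example-defloc-bucket/qubole",
    "s3://example-shared-bucket/public",
]

in_default_location = is_covered_by_prefixes(
    "s3://example-defloc-bucket/qubole/logs/query-1", cross_account_prefixes
)
in_sensitive_bucket = is_covered_by_prefixes(
    "s3://example-sensitive-bucket/pii/table", cross_account_prefixes
)
print(in_default_location, in_sensitive_bucket)
```

A path under the sensitive bucket is not covered, which corresponds to the case where Explore, Scheduler dependencies, and Amazon S3 paths in commands do not work.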