Handling AWS API Rate Limits

AWS limits the number of API calls that each account can make to services such as Amazon Elastic Compute Cloud (EC2), AWS Identity and Access Management (IAM), and Amazon S3. These limits exist for reasons such as performance and security.

Qubole works around these API limits as far as possible, which involves certain trade-offs. However, as an AWS account grows, rate limiting eventually becomes unavoidable, and users must then split the workload along dimensions such as time, region, or a new account.

To handle rate limits, Qubole has made multiple optimizations in the orchestrator that reduce the number of API calls, which helps alleviate the API rate limiting problem.

QDS has made the following optimizations:

  • QDS uses TagOnCreate while bringing up EC2 instances to avoid a separate CreateTags API call. This is currently supported for OnDemand and SpotFleet instances. If you are using heterogeneous clusters, ensure that the qubole-ec2-spot-fleet-role IAM role has permission to attach tags to instances. Note that AmazonEC2SpotFleetRole (an AWS managed policy) does not grant permission to tag instances. A simple way to ensure sufficient tagging permission is to attach the AmazonEC2SpotFleetTaggingRole policy to the role. Otherwise, the benefits of TagOnCreate do not apply to SpotFleet instances.

  • QDS uses TagOnCreate to tag volumes while bringing up instances. This is currently supported only for OnDemand instances.

  • QDS further reduces CreateTags API calls by bulk tagging resources such as instances and volumes. With bulk tagging, individual worker nodes do not get different tags. For example, when this optimization is enabled, worker nodes are not tagged individually with numbers such as node0001 and node0002. Instead, all worker nodes have the same tag, which is the cluster name (qbol_acc<account_id>_cl<cluster_id>). Because the worker nodes no longer have distinct tags, the S3 path to which logs are synchronized contains the EC2 instance ID instead of the node number. If you depend on the log location in S3, enabling this optimization can break that dependency.

    Because this change can break dependencies on S3 locations, this feature is not available by default. Create a ticket with Qubole Support to get this feature enabled on the QDS account.

  • QDS further reduces API calls by not tagging inessential resources such as spot requests. This enhancement is not available by default. Create a ticket with Qubole Support to get it enabled on the QDS account.
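
The tagging optimizations above can be illustrated with a minimal Python sketch. It is not QDS orchestrator code. It assumes that TagOnCreate corresponds to passing TagSpecifications in the EC2 RunInstances request, and that bulk tagging corresponds to one CreateTags request listing several resource IDs. The helper names, the "Name" tag key, and the sample IDs are hypothetical; only the cluster tag format comes from the text above.

```python
def cluster_tag(account_id, cluster_id):
    """Shared tag value for all nodes of a cluster, in the format given above."""
    return f"qbol_acc{account_id}_cl{cluster_id}"


def to_tag_list(tags):
    """Convert a plain dict into the Key/Value list that the EC2 API expects."""
    return [{"Key": key, "Value": value} for key, value in tags.items()]


def run_instances_params(image_id, instance_type, count, tags):
    """Build RunInstances parameters that tag instances and their volumes
    in the launch request itself, so no follow-up CreateTags call is needed."""
    tag_list = to_tag_list(tags)
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        "TagSpecifications": [
            {"ResourceType": "instance", "Tags": tag_list},
            {"ResourceType": "volume", "Tags": tag_list},
        ],
    }


def bulk_create_tags_params(resource_ids, tags):
    """Build parameters for a single CreateTags request that applies the
    same tags to every listed resource, instead of one request per resource."""
    return {"Resources": list(resource_ids), "Tags": to_tag_list(tags)}


# Hypothetical account ID, cluster ID, and tag key, for illustration only.
tags = {"Name": cluster_tag(1234, 5678)}
launch = run_instances_params("ami-12345678", "m5.xlarge", 3, tags)
bulk = bulk_create_tags_params(["i-0aaa", "i-0bbb", "vol-0ccc"], tags)
```

With boto3, these dictionaries could be passed as ec2.run_instances(**launch) and ec2.create_tags(**bulk), where ec2 is boto3.client("ec2"). Tagging three resources individually would take three CreateTags requests, whereas the bulk form takes one, at the cost of every resource carrying identical tags.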