Known Issues

The known issues with this version are:

s3cmd Command Failures

s3cmd is a tool for managing objects in Amazon S3 storage. As part of R57, the s3cmd package was upgraded from version 1.5 to version 2.0.2.

After upgrading to QDS version R57, you may observe s3cmd command failures in QDS commands or Airflow jobs. s3cmd fails with exit code 1. If you call s3cmd with the -d option, you also see the 403 Forbidden error message.

This is a known issue.

Current Scenario

During internal testing, Qubole has discovered an issue where cp and mv commands in s3cmd version 2.0.2 fail in a very specific scenario.

The s3cmd cp and mv commands fail with exit code 1 when you try to copy or move an object that you own in another AWS account. This happens even when you have read/write access to the object and the bucket. Note that the object is still copied in this scenario, but the command returns exit code 1, which indicates a failure.
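To check whether a failing command matches the symptom described above (exit code 1 together with a 403 Forbidden message in the -d output), a minimal Python sketch could look like the following. The helper names and the bucket paths are hypothetical, and matching on the strings "403" and "Forbidden" is an assumption about how the debug output is worded, not a documented s3cmd contract.

```python
import subprocess


def matches_known_issue(returncode, debug_output):
    """Return True when a result looks like the symptom described above.

    Assumption: the -d output contains both "403" and "Forbidden" somewhere.
    """
    return returncode == 1 and "403" in debug_output and "Forbidden" in debug_output


def run_s3cmd_copy(src, dst):
    """Run `s3cmd -d cp src dst` and return (exit code, combined output)."""
    proc = subprocess.run(
        ["s3cmd", "-d", "cp", src, dst],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # debug messages may go to stderr
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Hypothetical bucket and key names; replace them with your own.
    code, output = run_s3cmd_copy(
        "s3://example-source-bucket/data.csv",
        "s3://example-target-bucket/data.csv",
    )
    if matches_known_issue(code, output):
        print("Matches the known s3cmd 2.0.2 cp/mv issue; the object may still have been copied.")
    else:
        print("s3cmd exit code:", code)
```

Because the object is copied despite the failing exit code, a match here suggests confirming that the target object exists before retrying the command.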

Solution

If you observe s3cmd command failures after upgrading to R57, you can prevent them by reverting to the older version of s3cmd. To do so, add the following lines to the cluster’s node bootstrap file.

pip uninstall -y s3cmd
pip install s3cmd==1.5.2

Note

You need to restart the cluster for the s3cmd version change to take effect.
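After the restart, one way to confirm that the older s3cmd is actually in use is to parse the output of s3cmd --version. The sketch below assumes that the output contains a dotted version number (for example, a line such as "s3cmd version 1.5.2"); the function names are hypothetical.

```python
import re
import subprocess


def parse_s3cmd_version(version_output):
    """Extract a version tuple such as (1, 5, 2) from `s3cmd --version` output.

    Returns None if no dotted version number is found.
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def installed_s3cmd_version():
    """Ask the s3cmd found on PATH for its version."""
    output = subprocess.check_output(["s3cmd", "--version"], universal_newlines=True)
    return parse_s3cmd_version(output)


if __name__ == "__main__":
    version = installed_s3cmd_version()
    if version is not None and version < (2, 0, 0):
        print("s3cmd has been reverted to a 1.x release:", version)
    else:
        print("s3cmd is still at:", version)
```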

Boto Client Error

Boto is a Python package that provides interfaces to Amazon Web Services. AWS has deprecated V2 signatures for AWS regions created after January 2014, and any new S3 bucket created after June 24, 2020 will support only V4 signatures.

Qubole has made changes in R57 so that S3 client calls through boto use V4 signatures by default. This default is set in the /etc/boto.cfg file. As a result, when you connect to S3 in commands or in notebooks after clusters are started with QDS version R57, you may hit the following error: BotoClientError: When using SigV4, you must specify a 'host' parameter.

Solution

Qubole recommends that you include the host=s3.amazonaws.com parameter in Boto S3 connect calls. Until you add the host parameter, you can prevent the Boto client error by running the following command through the node bootstrap.

Note

You must restart the cluster for the following command to take effect.

rm /etc/boto.cfg
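The recommended fix, passing the host parameter in the Boto S3 connect call, might look like the following sketch for boto 2.x. The helper names are hypothetical, and the sketch assumes that AWS credentials are already available to boto on the cluster (for example, through its configuration or the environment).

```python
S3_HOST = "s3.amazonaws.com"  # endpoint named in the recommendation above


def s3_connect_kwargs(host=S3_HOST):
    """Keyword arguments that give boto the explicit host SigV4 asks for."""
    return {"host": host}


def connect_to_s3(host=S3_HOST):
    """Open a boto 2.x S3 connection with an explicit host."""
    import boto  # imported lazily so the helper above works without boto installed

    return boto.connect_s3(**s3_connect_kwargs(host))


if __name__ == "__main__":
    # Assumes credentials are configured; lists bucket names as a smoke test.
    connection = connect_to_s3()
    for bucket in connection.get_all_buckets():
        print(bucket.name)
```

With the host set explicitly in every connect call, /etc/boto.cfg can stay in place and the rm workaround is no longer needed.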