The known issues with this version are:
s3cmd Command Failures
s3cmd is a tool for managing objects in Amazon S3 storage. As part of R57, the s3cmd package has been upgraded from version 1.5 to version 2.0.2.
After upgrading to QDS version R57, you may observe s3cmd command failures in QDS commands or Airflow jobs. The failing command returns exit code 1. If you call s3cmd with the -d option, you also see the 403 Forbidden error message. This is a known issue.
During internal testing, Qubole discovered an issue where mv commands in s3cmd version 2.0.2 fail in a very specific scenario.
s3cmd mv commands fail with exit code 1 when you try to copy or move an object that you own in another AWS account. This happens even when you have read/write access to the object or bucket. Note that the objects do get copied in this scenario, but the command returns exit code 1, which indicates a failure.
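To check whether a job is hitting this scenario, it can help to capture both the exit code and the debug output of the mv call. The following is a minimal Python sketch, not a Qubole-provided tool. The helper names run_s3cmd_mv and summarize are hypothetical, and the sketch assumes that s3cmd is on the PATH and that the 403 message appears somewhere in the -d output.

```python
import subprocess


def summarize(returncode, output):
    """Summarize an s3cmd run from its exit code and combined output."""
    return {
        "exit_code": returncode,
        "failed": returncode != 0,
        # Assumption: with -d, the debug output mentions the 403 response.
        "saw_403": "403" in output,
    }


def run_s3cmd_mv(src, dst, s3cmd="s3cmd"):
    """Run `s3cmd -d mv src dst` and return a summary dictionary."""
    proc = subprocess.run(
        [s3cmd, "-d", "mv", src, dst],
        capture_output=True,
        text=True,
    )
    # Debug messages may go to either stream, so inspect both.
    return summarize(proc.returncode, proc.stdout + proc.stderr)
```

A summary with exit_code equal to 1 and saw_403 set to True matches the symptoms described above. Because the object may already have been copied, verify the destination before retrying.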
If you observe s3cmd command failures after upgrading to R57, you can prevent them by reverting to the older version of s3cmd. To do so, add the following lines to the cluster's node bootstrap file.
pip uninstall s3cmd
pip install s3cmd==1.5.2
You need to restart the cluster for the s3cmd version change to be effective.
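After the restart, you may want to confirm which s3cmd version a node is actually running. The sketch below is one way to do that from Python. The function names are hypothetical, and it assumes that s3cmd --version prints a line containing a dotted version number such as 1.5.2.

```python
import re
import subprocess


def parse_s3cmd_version(text):
    """Extract the first X.Y.Z version number from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def installed_s3cmd_version(s3cmd="s3cmd"):
    """Run `s3cmd --version` and return the parsed version tuple."""
    proc = subprocess.run(
        [s3cmd, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_s3cmd_version(proc.stdout + proc.stderr)
```

If installed_s3cmd_version() still returns (2, 0, 2) after the restart, the bootstrap lines did not take effect on that node.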
Boto Client Error
Boto is a Python package that provides interfaces to Amazon Web Services. AWS has deprecated use of the V2 signature for AWS regions created after January 2014. In addition, AWS will allow any new S3 bucket created after June 24, 2020 to use only the V4 signature.
Qubole has made changes in R57 so that S3 client calls through Boto use the V4 signature by default. This default is set through the Boto configuration file. When Boto uses the V4 signature, the host parameter is required. If you have not provided the host parameter, the following error appears:
BotoClientError: When using SigV4, you must specify a 'host' parameter
As a result, you may hit this error when connecting to S3 in commands or in notebooks.
(That is, the error is not an inevitable consequence of using V4 signature. It only occurs when the host parameter is not provided.)
Qubole recommends that you include the host=s3.amazonaws.com parameter in Boto S3 connect calls. Until you add the host parameter, you can prevent the Boto client error by running the following command through the node bootstrap.
You must restart the cluster for the following command to be effective.
boto.cfg file results in the client using the V2 signature. It may cause failures if the cluster software must
communicate with S3 buckets that support only the V4 signature.
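For the recommended fix, passing the host parameter in a Boto S3 connect call might look like the following sketch. It assumes boto 2.x, whose boto.connect_s3 function accepts a host keyword argument. The helper names s3_connect_kwargs and connect are hypothetical and used only for illustration, and credentials are assumed to come from the environment or the instance role.

```python
S3_HOST = "s3.amazonaws.com"  # host value recommended in this section


def s3_connect_kwargs(host=S3_HOST, **extra):
    """Build keyword arguments for boto.connect_s3 that always include host."""
    kwargs = dict(extra)
    kwargs["host"] = host
    return kwargs


def connect(**extra):
    """Open an S3 connection with an explicit host (requires boto 2.x)."""
    import boto  # imported lazily so the helper above works without boto

    return boto.connect_s3(**s3_connect_kwargs(**extra))
```

With the host supplied on every call, the connection no longer depends on the signature setting in the Boto configuration, so the V4 default can stay in place.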