Hadoop

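The new features and key enhancements are:

  • Customizing User-Agent Header in HTTP Requests
  • Accessing GCP and OCI Cloud Object Storage Objects

Other enhancements and bug fixes are listed in:

  • Enhancements
  • Bug Fixes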

Customizing User-Agent Header in HTTP Requests

HADTWO-2235: Qubole has added support for customizing the User-Agent header that the s3a filesystem sends in HTTP requests. Refer to HADOOP-13122 for more details.
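
As a minimal sketch, assuming the backport exposes the same property that upstream HADOOP-13122 introduced (fs.s3a.user.agent.prefix), the following Python snippet renders the property element as it would appear in a Hadoop configuration file. The helper name hadoop_property and the prefix value my-app/1.0 are hypothetical and used only for illustration.

```python
from xml.etree import ElementTree as ET


def hadoop_property(name: str, value: str) -> str:
    """Render one <property> element in Hadoop configuration XML form."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# Assumption: the property name is the one introduced upstream by HADOOP-13122.
# "my-app/1.0" is a hypothetical prefix, not a value taken from these notes.
USER_AGENT_PROPERTY = hadoop_property("fs.s3a.user.agent.prefix", "my-app/1.0")
print(USER_AGENT_PROPERTY)
```

The configured value is a prefix, so the s3a filesystem still appends its own default User-Agent details after it.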

Accessing GCP and OCI Cloud Object Storage Objects

HADTWO-2233: You can now also access objects in Google Cloud Object Storage and Oracle Cloud Infrastructure Object Storage through Hadoop on AWS clusters. Ensure that Java 8 is enabled on the cluster or account.

Enhancements

  • HADTWO-2196: Qubole has backported YARN-3933, which fixes a race condition in FairScheduler and so avoids negative capacity.
  • HADTWO-2252: Qubole has backported HADOOP-14255, which optimizes fake directory handling when creating new directories.
  • HADTWO-2273: Shell scripts uploaded to the qubole_shell_scripts folder in the default location (DefLoc) are now deleted after the command completes.

Bug Fixes

  • HADTWO-1047: A worker node picks up the value of fs.s3a.buffer.dir from the node where the job starts. If the job starts on an EBS-only node and the worker node contains only instance store volumes, the worker node throws an exception in the LocalDirAllocator class because it does not contain any volume specified in the buffer directory. Because EBS is symbolically linked to the ephemeral path on EBS-only nodes, Qubole has appended the ephemeral path to the value of the buffer directory. In addition, Qubole has changed the value of fs.s3n.cache.dir to the ephemeral0 path for nodes that contain only EBS volumes or only NVMe volumes.

  • HADTWO-1751: Fixed /cluster/nodes API to accept GRACEFUL_DECOMMISSIONING as a query parameter and return correct results.

  • HADTWO-2105: Deleting a folder through the s3a filesystem now also deletes the special files ending with $folder$ that NativeS3FileSystem creates. Earlier, these special files were left behind, which caused issues in Hive while renaming a partition.

  • HADTWO-2191: Fixed an issue in which ResourceManager could deadlock while shutting down.

  • HADTWO-2253: Resolved an issue in which the LDAP configuration with HDFS failed due to problems with the credential provider factory.

  • HADTWO-2277: Fixed a ConcurrentModificationException in the FSParentQueue class.

  • HADTWO-2322: Tez queries that used hadoop-commons utilities relying on Guava's toStringHelper failed because the helper method is deprecated and has been removed from later versions of Guava.

    To resolve this issue, Qubole has replaced the deprecated Guava method with a stable alternative (StringBuilder).

    Refer to the open-source issue HADOOP-14891 for details.
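
The HADTWO-1047 fix listed above can be illustrated with a minimal Python sketch. It models only the underlying idea: a node can use a buffer directory only if that directory sits on a volume the node actually has. The function name, the error type, and all paths are hypothetical, and the real LocalDirAllocator logic is more involved than this.

```python
def usable_buffer_dirs(buffer_dir_value, local_volumes):
    """Return the configured buffer directories that exist on this node.

    buffer_dir_value: comma-separated value, as in fs.s3a.buffer.dir.
    local_volumes: mount points present on the worker node.
    Raises RuntimeError when no configured directory is usable, which
    stands in for the exception described in HADTWO-1047.
    """
    configured = [d.strip() for d in buffer_dir_value.split(",") if d.strip()]
    usable = [
        d for d in configured
        if any(d == v or d.startswith(v.rstrip("/") + "/") for v in local_volumes)
    ]
    if not usable:
        raise RuntimeError("no valid local directory in: " + buffer_dir_value)
    return usable


# Hypothetical paths: the job starts on an EBS-only node, while the worker
# node has only an instance store volume mounted at /media/ephemeral0.
worker_volumes = ["/media/ephemeral0"]
before_fix = "/media/ebs0/s3a"
after_fix = before_fix + ",/media/ephemeral0/s3a"  # ephemeral path appended

print(usable_buffer_dirs(after_fix, worker_volumes))
```

With the value before the fix, the same call raises the error because the worker node has no /media/ebs0 volume. Appending the ephemeral path gives every node type at least one directory it can resolve.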

For a list of bug fixes between versions R57 and R58, see Changelog for api.qubole.com.