Hive

This section describes new features, enhancements, and bug fixes.

Hive 2.3 (beta) Version

QHIVE-3438: Hive 2.3 (beta) is available. When Hive 2.3 (beta) runs on the QDS server, it uses Java 8. It is also compatible with Java 7 when it runs on the master or HiveServer2. Beta

You can now select 2.3 (beta) as the Hive version when you create a cluster, either from the Configuration tab of the Clusters UI or through the cluster API. Cluster Restart Required

For more information, see Understanding Hive Versions.

Multi-line Column Data in Hive Query Results

QHIVE-2650: In the UI, query results that contained columns with multi-line data did not display correctly. To fix this, Qubole now supports newline (\n) and carriage return (\r) characters in Hive query results by escaping them in the Hive result set and then un-escaping them in the UI. To enable this feature, contact Qubole Support. Via Support

After this feature is enabled, even a simple SELECT query requires a cluster start.
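
The escape and un-escape round trip described above can be illustrated with a minimal Python sketch. It is not Qubole's implementation: the function names `escape_field` and `unescape_field` are hypothetical, and escaping the backslash itself is an assumption made here so that a literal backslash followed by `n` in the data is not mistaken for an escaped newline.

```python
def escape_field(value: str) -> str:
    """Make a column value single-line by escaping \\, \n and \r.

    Assumption: the backslash is escaped first so the mapping is reversible.
    """
    return (
        value.replace("\\", "\\\\")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def unescape_field(value: str) -> str:
    """Reverse escape_field by scanning the text left to right."""
    mapping = {"n": "\n", "r": "\r", "\\": "\\"}
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        # A backslash followed by a known code is one escaped character.
        if ch == "\\" and i + 1 < len(value) and value[i + 1] in mapping:
            out.append(mapping[value[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    original = "first line\nsecond line\r\nthird line"
    escaped = escape_field(original)
    print(escaped)                               # a single physical line
    print(unescape_field(escaped) == original)   # round trip check
```

The un-escape step scans left to right rather than chaining `str.replace` calls, because chained replacements can misread an escaped backslash that happens to be followed by `n` or `r`.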

Hive Logs are Available in the Analyze UI

QHIVE-3367: A detailed log for each Hive query that is executed using HiveServer2 or Hive-on-master is uploaded to a subdirectory in the default location on the cloud object storage within a couple of minutes of query completion. The location of the logs is visible in the Logs tab of the Analyze UI page. Individual log files are created for each query at /media/ephemeral0/hive_query_logs, in addition to the existing logs. Via Support

Hadoop 2 (Hive) Clusters support Pig Version 0.17

ACM-3714: While creating a Hadoop 2 (Hive) cluster, you can configure Pig version 0.17 through the Clusters UI or through the cluster API. Cluster Restart Required Beta

You can also choose between MapReduce and Tez as the execution engine when you select Pig 0.17 (beta). Pig 0.17 (beta) is supported only with Hive 1.2.0.

Enhancements

  • QHIVE-3675: While processing a FileSplit, if a FileNotFoundException is encountered because of an S3 listing inconsistency, Hive retries up to the number of times configured in hive.qubole.handle.s3.stale.listing.retries (default: 10), with a 1-second gap between retries. If the error persists, Hive ignores the split and moves on to the next one. The S3 stale listing feature is not enabled by default. Via Support
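
The retry-then-skip behavior in QHIVE-3675 can be sketched roughly as follows. This is an illustration in Python, not the Hive code: `process_splits`, `read_split`, and the injectable `sleep` argument are hypothetical names, and the sketch assumes that the configured value counts retries after the first attempt.

```python
import time
from typing import Callable, Iterable, List, TypeVar

S = TypeVar("S")
T = TypeVar("T")


def process_splits(
    splits: Iterable[S],
    read_split: Callable[[S], T],
    retries: int = 10,            # mirrors the documented default of 10
    gap_seconds: float = 1.0,     # mirrors the documented 1-second gap
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Read each split, retrying on FileNotFoundError, skipping on persistent failure."""
    results: List[T] = []
    for split in splits:
        # Assumption: one initial attempt plus `retries` further attempts.
        for attempt in range(retries + 1):
            try:
                results.append(read_split(split))
                break
            except FileNotFoundError:
                if attempt == retries:
                    # The error persisted: ignore this split and move on.
                    break
                sleep(gap_seconds)
    return results
```

Passing `sleep` as a parameter is only a convenience for exercising the sketch without waiting in real time.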

Bug Fixes

  • QHIVE-1723: As part of the fix for failures when dropping tables with partitions, the default timeout for a drop table operation is increased to 30 minutes. The timeout for the DROP TABLE operation alone is now configurable using the hive.qubole.drop.table.metastore.client.socket.timeout parameter as a cluster-level override. Cluster Restart Required To enable it at the account level, contact Qubole Support. Via Support
  • QHIVE-2079: Fixed a potential race condition causing FileNotFoundException when multiple INSERT queries with Dynamic-Partitioning enabled were run in parallel.
  • QHIVE-3528: Resolved an issue where a Hive table became corrupted because the total number of buckets in a partition was greater than the expected count. This issue occurred when the dynamic partition prefix was enabled on the account. During an INSERT OVERWRITE operation, Hive now creates empty buckets while loading a table or partition only if they are required to match the expected number of buckets defined for the table. Via Support
  • QHIVE-3560: Fixed a race condition in which multiple commands tried to create the downloaded resources directory with the same name.
  • QHIVE-3600: Fixed queries failing with FileNotFoundException when hive.optimize.skewjoin is enabled and hive.auto.convert.join is disabled.
  • QHIVE-3753: When a Presto view is used in a Hive command, Hive now prints a warning in the console logs and fails the query, instead of failing the query with a NullPointerException.
  • QHIVE-3828: Fixed a NullPointerException that occurred while dropping a permanent function in a different session.
  • QHIVE-3875: Queries failing with FileNotFoundException when finding the FileSystem timestamp (due to S3 eventual consistency) will now retry getting the file for the configured hive.qubole.handle.s3.stale.listing.retries number of times (default value of retries is 10).
  • QHIVE-3916: The 2000-character limitation on the metastore column type name has been fixed in Hive version 2.1.1.
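
As an illustration of the "only if required" rule in QHIVE-3528, the following Python sketch works out which bucket IDs have no file yet in a partition directory, so that empty files would be created only for those. It is a simplified model, not Hive's loader: `missing_bucket_ids` is a hypothetical helper, and the sketch assumes that bucket files begin with a zero-padded bucket ID followed by an underscore (for example, 000001_0).

```python
import re
from typing import Iterable, List

# Assumption: a bucket file name starts with "<bucket id>_<attempt>".
_BUCKET_FILE = re.compile(r"^(\d+)_\d+")


def missing_bucket_ids(file_names: Iterable[str], expected_buckets: int) -> List[int]:
    """Return the bucket IDs in [0, expected_buckets) that have no file yet."""
    present = set()
    for name in file_names:
        match = _BUCKET_FILE.match(name)
        if match:
            present.add(int(match.group(1)))
    return [b for b in range(expected_buckets) if b not in present]


if __name__ == "__main__":
    # Buckets 1 and 3 have no file, so only those two would need empty files.
    print(missing_bucket_ids(["000000_0", "000002_0"], 4))
    # Every bucket already has a file, so nothing would be created.
    print(missing_bucket_ids(["000000_0", "000001_0"], 2))
```

When the list is empty, no extra files are written, which keeps the number of bucket files in the partition equal to the count defined for the table.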

For a list of bug fixes between versions R54 and R55, see Changelog for api.qubole.com.