Hive
Bug Fixes
HIVE-1969: A MapJoin/SkewJoin issue caused queries to take longer than expected.
HIVE-2338: The Null Pointer exception did not give a descriptive message for the query failures that involved data writes at a base bucket location.
As a resolution, QDS throws a descriptive illegal argument exception instead of the Null Pointer exception for such query failures.
HIVE-2707: Counters for FAILED queries were null.
In addition to the counters printed for the SUCCESSFUL queries, counters for the FAILED queries are printed now.
HIVE-2865: In accounts in which Hive Authorization is enabled, QDS adds the configuration parameter hive.security.authorization.enabled to Hive’s Restricted List to prevent users from bypassing Hive Authorization when they run a query. You can change the setting at the cluster level in the cluster’s Hive Settings > Override Hive Configuration field under the Advanced Configuration tab. To enable Hive Authorization in a QDS account, contact Qubole Support. Hive Authorization is now supported in Hive 2.1 (in addition to Hive 1.2).
HIVE-2907: The logical query optimization phase was slow while getting predicates using HiveRelMdPredicates when there were many equivalent columns and CBO was enabled.
HIVE-3179: A memory leak issue with the UDFClassLoader and ClassLoaderResolverImpl objects on HiveServer2.
HIVE-3185: Hive commands used Tez as the execution engine even when MapReduce was configured.
As a resolution, the Hive execution engine that is added as the cluster override will take precedence over the Hive execution engine set in the account settings.
HIVE-3197: To avoid a StackOverflowError when a large number of partitions are dropped, a new configuration parameter, hive.metastore.drop.partitions.batch.size, has been introduced to drop partitions in batches. Pass the batch size to hive.metastore.drop.partitions.batch.size (at the cluster/query level or in a Hive bootstrap) to drop the partitions in batches. The default value of this parameter is 0, so it has no effect unless a value is specified.
HIVE-3269: Enabling hive.optimize.skewjoin resulted in the job’s failure with the FNFException.
HIVE-3271: A failure in Hive vectorization is fixed by handling the NullPointerException in VectorUDFWeekOfYearString.
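As a minimal sketch of the query-level setting described under HIVE-3197, the batch size can be set before the statement that drops the partitions. The batch size of 1000, the table name web_logs, and the partition column dt are illustrative assumptions, not recommended values.

```sql
-- Assumed batch size; the default of 0 leaves batching off.
SET hive.metastore.drop.partitions.batch.size=1000;

-- Hypothetical table and partition column.
ALTER TABLE web_logs DROP IF EXISTS PARTITION (dt < '2017-01-01');
```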
HIVE-3298: A Tez query failed with the No work found for tablescan error when dynamic partition pruning was enabled.
HIVE-3319: All hive.cli.* parameters have been added to the list of whitelisted parameters. You can configure these parameters at runtime without adding them to hive.security.authorization.sqlstd.confwhitelist when Hive Authorization and HiveServer2 are enabled.
HIVE-3402: A ClassNotFoundException occurred because Kryo’s classloader was set only once, during initialization.
HIVE-3484: QDS disallows the hive.on.master configuration in Hadoop Overrides.
QTEZ-313: The deadlock in ApplicationMaster is resolved by removing the calls from the task attempt to the task. The task passes the location hint and task spec to the TaskAttempt constructor.
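For HIVE-3319, a runtime setting might look like the following sketch. It uses hive.cli.print.header, a standard Hive CLI parameter, as one example of the hive.cli.* family; the table name is hypothetical.

```sql
-- One example of a hive.cli.* parameter set at runtime,
-- with no entry in hive.security.authorization.sqlstd.confwhitelist.
SET hive.cli.print.header=true;

-- Hypothetical table, used only to show a query following the setting.
SELECT * FROM web_logs LIMIT 10;
```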
QTEZ-315: A Hive query with UNION ALL failed when Tez was set as the execution engine.
QTEZ-330: Parallel Hive queries on Hive 2.1.1, Tez, and Hive-on-coordinator on a non-HiveServer2 cluster failed intermittently.
To resolve this, Hive 2.1 supports parallel INSERT INTO values from the same session. The Hive session ID is generated randomly for each query, which avoids race conditions in the session directories.
Enhancements
HIVE-2515: The HS2 health status is available through the Datadog monitoring service. Beta, Via Support
HIVE-2584: Qubole encrypts the Hive metastore passwords. Beta, Via Support
HIVE-3174: Complex expressions are supported in OUTER JOINs by extending column pruner to account for residual filter expression in the JOIN operator.
HIVE-3193: In a SELECT query, Hive checks and waits until the files written to the S3 location are visible, to account for S3 eventual consistency. Disabled
HIVE-3220: Hive 2.1.1 now supports multiline comments within query expressions.
HIVE-3275: A Datadog dashboard for Hive Metastore Server (HMS) is added for Hive, Spark, and Presto clusters. An alert on the HMS Memory usage is also added. Beta, Via Support
HIVE-3276: Liveness and Health Checks for the Hive Metastore Server (HMS) are added in Datadog as follows: Beta, Via Support
Liveness: Alert if HMS process is not available.
Health Check: Run a sample command and check if the services are responding within a given timeout/SLA. Otherwise, create an alert.
HIVE-3347: Support for the Parquet file format is added to hive.default.fileformat.
HIVE-3417: The metastore consistency check (MSCK) result is displayed only in Logs instead of the Results tab of the Analyze UI when the configuration parameter hive.qubole.write.msck.result.to.log is enabled at the query/cluster level or in a Hive bootstrap. Cluster Restart Required (for the cluster-level setting).
HIVE-3434: The AvroSerDe’s InstanceCache is now thread safe, which avoids a NullPointerException when the InstanceCache is accessed by multiple threads simultaneously.
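The settings in HIVE-3347 and HIVE-3417 can be sketched at the query level as follows. This assumes that Parquet is the accepted spelling of the file format value and that true enables the MSCK parameter; the table page_views and its columns are hypothetical.

```sql
-- HIVE-3347: assumed value spelling for the default file format.
SET hive.default.fileformat=Parquet;

-- Hypothetical table; with the setting above, no STORED AS clause is given.
CREATE TABLE page_views (user_id BIGINT, url STRING)
PARTITIONED BY (dt STRING);

-- HIVE-3417: assumed value "true" sends the MSCK result to Logs.
SET hive.qubole.write.msck.result.to.log=true;
MSCK REPAIR TABLE page_views;
```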
QTEZ-217: When Tez is the execution engine in Hive queries, QDS provides an account-level configuration to limit the number of AWS API calls. Beta, Via Support
QTEZ-244: QDS has added the Datadog metrics for the Application History Server. Beta, Via Support