Hive

Bug Fixes

  • HIVE-1969: A MapJoin/SkewJoin issue that caused queries to take longer than expected.
  • HIVE-3179: A memory leak issue with the UDFClassLoader and ClassLoaderResolverImpl objects on HiveServer2.
  • HIVE-3185: Hive commands used Tez as the execution engine even when MapReduce was configured. To resolve this, the Hive execution engine set as a cluster override now takes precedence over the Hive execution engine set in the account settings.
  • HIVE-3197: To prevent a StackOverflowError when there are very many or very large partitions to drop, a new configuration parameter, hive.metastore.drop.partitions.batch.size, has been introduced to drop partitions in batches. You pass the batch size to hive.metastore.drop.partitions.batch.size (at the cluster or query level or in the Hive bootstrap script). The default value for this parameter is zero; there is no effect unless you specify a value greater than zero.
  • HIVE-3269: Enabling hive.optimize.skewjoin resulted in the job failing with FNFException.
  • HIVE-3271: A Hive vectorization failure caused by a NullPointerException in VectorUDFWeekOfYearString. The exception is now handled.
  • HIVE-3298: Tez queries were failing with a No work found for tablescan error when dynamic partition pruning was enabled.
  • HIVE-3319: All hive.cli.* parameters have been added to the list of whitelisted parameters. You can configure these parameters at runtime; you do not need to add them to hive.security.authorization.sqlstd.confwhitelist when Hive Authorization and HiveServer2 are enabled.
  • HIVE-3402: A ClassNotFoundException occurred because Kryo’s classloader was set only once, during initialization.
  • HIVE-3484: QDS no longer supports setting hive.on.master in Hadoop overrides (the Override Hadoop Configuration Variables field in the Clusters configuration dialog in the QDS UI).
  • QTEZ-313: A deadlock in the ApplicationMaster. It was fixed by removing the calls from the task attempt to the task; the task now passes the location hint and task specification to the TaskAttempt constructor.
  • QTEZ-315: A Hive query with UNION ALL failed when Tez was set as the execution engine.
  • QTEZ-330: Parallel Hive queries on Hive 2.1.1 with Tez and Hive-on-Master on a non-HiveServer2 cluster failed intermittently. To resolve this, Hive 2.1 now supports parallel INSERT INTO values from the same session. The Hive session ID is generated randomly for each query, which prevents race conditions in the session directories.
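
As a rough illustration of the HIVE-3197 setting above, the query-level form could look like the sketch below. The batch size of 1000 and the table and partition names are hypothetical values chosen only for this example; they are not recommendations from these notes.

```sql
-- Hypothetical batch size; any value greater than zero enables batching.
SET hive.metastore.drop.partitions.batch.size=1000;

-- Hypothetical table and partition column, for illustration only.
ALTER TABLE web_logs DROP IF EXISTS PARTITION (dt < '2017-01-01');
```

If the parameter is left at its default of zero, it has no effect on how partitions are dropped.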

Enhancements

  • HIVE-3174: Hive now supports complex expressions in OUTER JOINs by extending the column pruner to account for residual filter expressions in the JOIN operator.
  • HIVE-3220: Hive 2.1.1 now supports multi-line comments in query expressions.
  • HIVE-3347: hive.default.fileformat now accepts the Parquet format among the possible values.
  • HIVE-3417: When hive.qubole.write.msck.result.to.log is enabled at the query or cluster level or in a Hive bootstrap file, the metastore consistency check (MSCK) result is displayed only under Logs instead of the Results tab of the Analyze page of the QDS UI. A cluster restart is required for the cluster-level setting.
  • HIVE-3434: AvroSerDe’s InstanceCache is now thread-safe. There is no longer a NullPointerException when the InstanceCache is accessed by multiple threads simultaneously.
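
As a minimal sketch of the HIVE-3347 enhancement above, the default file format could be set at the query level as shown below. The table name and columns are hypothetical and serve only to show a CREATE TABLE statement that relies on the default.

```sql
-- Use Parquet for tables created without an explicit format.
SET hive.default.fileformat=Parquet;

-- Hypothetical table: no STORED AS clause, so the default
-- file format set above applies.
CREATE TABLE page_views (user_id BIGINT, url STRING);
```

A table created with an explicit STORED AS clause is not affected by this setting.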