Hive
Hive 2.1 is Generally Available
Hive version 2.1 is now generally available. Cluster restart required.
HA Proxy in the Cluster Coordinator Node
INFRA-1016 and INFRA-603: Qubole has added HA Proxy on the cluster coordinator node to balance the load when there are multiple connections between the cluster and the Qubole-managed Hive Metastore. This removes a single point of failure and improves stability. Available via Support.
If you are using a bastion node, you must allow incoming TCP traffic from the cluster's coordinator node on ports 20001 through 20005, in addition to port 7000.
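The port requirement above can be checked from the coordinator node with a short script. The following is a minimal sketch, not a Qubole tool: the host name bastion.example.com and the helper names are hypothetical, and it only tests whether a TCP connection can be opened on each port. A refused connection (nothing listening) is reported the same way as a blocked one.

```python
import socket

# Ports named in the note above: 7000 plus the range 20001-20005.
REQUIRED_PORTS = [7000] + list(range(20001, 20006))


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def blocked_ports(host, ports=REQUIRED_PORTS, timeout=3.0):
    """Return the subset of ports that could not be reached on host."""
    return [p for p in ports if not port_is_open(host, p, timeout)]


if __name__ == "__main__":
    # Hypothetical bastion address; replace it with your own.
    print(blocked_ports("bastion.example.com"))
```

An empty list means every required port accepted a connection.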
Enhancements
QHIVE-1633: The logs for Hive Tez jobs display the split computation, progress state, and the completion state.
QHIVE-3727: The default value of hive.metastore.drop.partitions.batch.size has been set to 1000 so that partitions are dropped in batches of 1000. You can configure this parameter based on the number of partitions that you want to drop.
EAM-1334: Qubole supports accessing a custom Hive metastore through QDS.
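To illustrate the effect of the batch size described in QHIVE-3727, the sketch below splits a list of partition names into batches. It only illustrates batching in general and is not Hive's implementation; the function name and the sample partition names are hypothetical.

```python
def batches(items, batch_size=1000):
    """Yield successive lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# 2,500 made-up partition names are split into batches of 1000, 1000, and 500.
partitions = [f"dt=part-{i}" for i in range(2500)]
sizes = [len(b) for b in batches(partitions)]
print(sizes)
```

A smaller batch size means more, smaller drop requests; a larger one means fewer, larger requests.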
Bug Fixes
QHIVE-3508: If you configure hive.groupby.skewindata, it is recommended that you also configure hive.groupby.skewindata.use.rand.with.seed to avoid data inconsistency when map tasks are reattempted.
QHIVE-3582: The issue in which pruning columns resulted in an incorrect order of columns in the SelectOperator has been resolved.
QHIVE-3641: When Tez is the execution engine, the container memory for reduce tasks is now decided based on the mapreduce.reduce.memory.mb configuration instead of mapreduce.map.memory.mb. This resolves the issue in which overriding the defaults for map/reduce tasks at the Hive query level was unsuccessful.
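As an illustration of a query-level override for QHIVE-3641, the sketch below renders Hive SET statements in front of a query. The helper name, the table name, and the 4096 MB value are assumptions chosen for the example. mapreduce.reduce.memory.mb comes from the note above, and hive.execution.engine is the standard Hive property for selecting the execution engine.

```python
def with_overrides(query, overrides):
    """Prefix a Hive query with one SET statement per override."""
    lines = [f"SET {key}={value};" for key, value in overrides.items()]
    return "\n".join(lines + [query])


script = with_overrides(
    "SELECT COUNT(*) FROM example_table;",  # hypothetical table
    {
        "hive.execution.engine": "tez",
        "mapreduce.reduce.memory.mb": 4096,  # illustrative value, in MB
    },
)
print(script)
```

With the fix in place, a reduce-side override like this one is expected to determine the reduce container memory under Tez.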