Using Hive ACID Tables

Qubole Hive supports different operations that you can use on Hive ACID tables, which are described in the following list:

Note

For information on how to perform compactions to aggregate delta directories that get created as a result of ACID operations, see Compaction of Hive Transaction Delta Directories.

Creating an ACID Table

You can create a full ACID table and an INSERT-only table.

Creating a Full ACID Table

To create a full ACID table set transactional to true in the table properties. You can use the following command to create a full ACID table. Only the ORC file format is supported with Full ACID table.

CREATE TABLE acidtbl (key int, value string) STORED AS ORC TBLPROPERTIES ("transactional"="true");

Creating an INSERT-only Table

To create an Insert-only ACID table, you must additionally specify the transactional property as insert_only in the table properties as shown in this example. All file formats are supported with the Insert-only table.

CREATE TABLE acidtbl_insert_only (key int, value string) STORED AS TEXTFILE TBLPROPERTIES ("transactional"="true", "transactional_properties"="insert_only");

Converting a Non-ACID table to an ACID Table

You can only convert a Non-ACID managed table into a ACID table. To convert an EXTERNAL Hive table into a Hive ACID table, you must convert it into a Non-ACID managed table by running this command.

ALTER TABLE T set tblproperties('EXTERNAL'='FALSE')

Converting a Non-ACID Managed Table to an ACID Table

You can convert a non-ACID Hive table to a full ACID table only when the non-ACID table data is in ORC format. If the non-ACID table is not in the ORC file format, then only Insert-only table conversion is supported. Unlike open-source Hive, Qubole Hive 3.1.1 (beta) does not have the restriction on the file names in the source table to strictly comply with the patterns that Hive uses to write the data.

However, when the table data is in the ORC file format, then you can convert it into a full ACID table or an Insert-only table.

To convert a Non-ACID table (only with table data in the ORC file format) to a full ACID table, use this command.

ALTER TABLE nonacidtbl SET TBLPROPERTIES ('transactional'='true');

To convert a Non-ACID table to Insert-only table, use this command.

ALTER TABLE nonacidtbl SET TBLPROPERTIES ('transactional'='true', 'transactional_properties'='insert_only');

Writing into an ACID Table

You can insert data into a full ACID or an Insert-only table just like inserting data into any other table in Hive. For example,

INSERT INTO acidtbl VALUES(1,'a');
INSERT INTO acidtbl VALUES(2,'b');

Updating Data in an ACID Table

Update allows you to update all rows in a table or a set of rows based on a condition.

To update all rows of the table, use this command.

UPDATE acidtbl SET value='updated';

To update some rows in table, use this command (as an example).

UPDATE acidtbl SET value='updated' WHERE key=1;

Deleting Data from an ACID Table

Delete allows you to delete all rows or some set of rows based on the condition.

To delete all rows from a table, use this command.

DELETE FROM acidtbl;

To delete some set of row from a table, use this command (as an example).

DELETE FROM acidtbl WHERE key=1;

Performing Merge Operation in an ACID Table

Merge allows you to combine Insert, Update, and potentially Delete statements into a single operation. Let us use this an example.

MERGE INTO acidtbl USING src
      ON acidtbl.key = src.key
      WHEN MATCHED AND src.value IS NULL THEN DELETE
      WHEN MATCHED AND (acidtbl.value != src.value) THEN UPDATE SET value = src.value
      WHEN NOT MATCHED THEN INSERT VALUES (src.key, src.value);