Using Hive ACID Tables¶
Qubole Hive supports different operations that you can use on Hive ACID tables, which are described in the following list:
- Creating an ACID Table
- Converting a Non-ACID table to an ACID Table
- Writing into an ACID Table
- Updating Data in an ACID Table
- Deleting Data from an ACID Table
- Performing Merge Operation in an ACID Table
For information on how to perform compactions to aggregate delta directories that get created as a result of ACID operations, see Compaction of Hive Transaction Delta Directories.
Creating an ACID Table¶
You can create a full ACID table and an INSERT-only table.
Creating a Full ACID Table¶
To create a full ACID table set
true in the table properties. You can use the following command to
create a full ACID table. Only the ORC file format is supported with Full ACID table.
CREATE TABLE acidtbl (key int, value string) STORED AS ORC TBLPROPERTIES ("transactional"="true");
Creating an INSERT-only Table¶
To create an Insert-only ACID table, you must additionally specify the transactional property as insert_only in the table properties as shown in this example. All file formats are supported with the Insert-only table.
CREATE TABLE acidtbl_insert_only (key int, value string) STORED AS TEXTFILE TBLPROPERTIES ("transactional"="true", "transactional_properties"="insert_only");
Converting a Non-ACID table to an ACID Table¶
You can only convert a Non-ACID managed table into a ACID table. To convert an EXTERNAL Hive table into a Hive ACID table, you must convert it into a Non-ACID managed table by running this command.
ALTER TABLE T set tblproperties('EXTERNAL'='FALSE')
Converting a Non-ACID Managed Table to an ACID Table¶
You can convert a non-ACID Hive table to a full ACID table only when the non-ACID table data is in ORC format. If the non-ACID table is not in the ORC file format, then only Insert-only table conversion is supported. Unlike open-source Hive, Qubole Hive 3.1.1 (beta) does not have the restriction on the file names in the source table to strictly comply with the patterns that Hive uses to write the data.
However, when the table data is in the ORC file format, then you can convert it into a full ACID table or an Insert-only table.
To convert a Non-ACID table (only with table data in the ORC file format) to a full ACID table, use this command.
ALTER TABLE nonacidtbl SET TBLPROPERTIES ('transactional'='true');
To convert a Non-ACID table to Insert-only table, use this command.
ALTER TABLE nonacidtbl SET TBLPROPERTIES ('transactional'='true', 'transactional_properties'='insert_only');
Writing into an ACID Table¶
You can insert data into a full ACID or an Insert-only table just like inserting data into any other table in Hive. For example,
INSERT INTO acidtbl VALUES(1,'a'); INSERT INTO acidtbl VALUES(2,'b');
Updating Data in an ACID Table¶
Update allows you to update all rows in a table or a set of rows based on a condition.
To update all rows of the table, use this command.
UPDATE acidtbl SET value='updated';
To update some rows in table, use this command (as an example).
UPDATE acidtbl SET value='updated' WHERE key=1;
Deleting Data from an ACID Table¶
Delete allows you to delete all rows or some set of rows based on the condition.
To delete all rows from a table, use this command.
DELETE FROM acidtbl;
To delete some set of row from a table, use this command (as an example).
DELETE FROM acidtbl WHERE key=1;
Performing Merge Operation in an ACID Table¶
Merge allows you to combine
Update, and potentially
Delete statements into a single operation.
Let us use this an example.
MERGE INTO acidtbl USING src ON acidtbl.key = src.key WHEN MATCHED AND src.value IS NULL THEN DELETE WHEN MATCHED AND (acidtbl.value != src.value) THEN UPDATE SET value = src.value WHEN NOT MATCHED THEN INSERT VALUES (src.key, src.value);