Composing a Query Export (AWS)

A query export is a combination of a Hive query followed by a data export command. See Composing a Hive Query, Composing a Data Export Command through the UI, and Query Export for more information.
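The two stages can be pictured as a Hive query that materializes results in Cloud storage, followed by an export that moves those results into the data store. A minimal sketch in Python, where all names, bucket paths, and field keys are illustrative assumptions rather than the product's actual API:

```python
# Illustrative sketch: a query export pairs a Hive query with the export
# command that follows it. All names below are hypothetical examples.
hive_query = """
INSERT OVERWRITE DIRECTORY 's3://example-bucket/exports/daily_sales/'
SELECT sale_date, SUM(amount) AS total
FROM sales
GROUP BY sale_date
"""

# The export stage then picks up the query output from the Cloud storage
# directory and writes it to the target table in the data store.
query_export = {
    "query": hive_query.strip(),
    "export_directory": "s3://example-bucket/exports/daily_sales/",
    "target_table": "daily_sales_summary",  # table in the data store
}
```

The key point is the ordering: the export never runs against live query state; it only reads what the Hive query has already written out.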

Note

Hadoop 2 (Hive) clusters support Hive queries. See Mapping of Cluster and Command Types for more information.

You can configure the Pig version on a Hadoop 2 (Hive) cluster. Pig 0.11 is the default version; Pig 0.15 and Pig 0.17 (beta) are also supported. When you set the Pig 0.17 (beta) version, you can choose between MapReduce and Tez as the execution engine. Pig 0.17 (beta) is supported only with Hive 1.2.0.

Perform the following steps to compose a query export command.

Note

See Using the Supported Keyboard Shortcuts in Analyze for the supported keyboard shortcuts.

  1. Navigate to the Analyze page and click Compose. Select Query Export from the Command Type drop-down list.
  2. Compose a Hive query, or specify a path to a saved query. See Composing a Hive Query for more information about composing and running Hive queries.
  3. Choose HiveTable Export (the default) or Directory Export:
    • Choose HiveTable Export to export data from a Hive table. The Hive query writes data to a Cloud storage directory that an existing Hive table points to; you can also write data to a Hive partition. The export then reads from the Hive table and writes the data to the data store table.
    • Choose Directory Export to export data from a Cloud storage directory. The Hive query first writes data to a Cloud storage directory; the export then picks up the data from that directory and writes it to the data store table. (See Composing a Directory Export Command for more information.)
  4. Specify the HiveTable or the Export Directory.
  5. Specify the Hive partition if any.
  6. Select a Data Store from the drop-down list.
  7. Choose a table from the DbTable drop-down list.
  8. Choose a DB Update Mode. Append Mode is the default. The other two options are Update Only Mode and Insert and Update Mode (supported only for an Oracle MySQL database).
  9. To run the command on a Hadoop cluster, select the Use Hadoop Cluster check box and choose the cluster label from the drop-down list.
  10. Click Run to execute the command. Click Save if you want to re-run the same command later. (See Workspace for more information on saving commands/queries.)
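The choices made in steps 2 through 9 can be sketched as a single command payload. This is a hedged illustration in Python: the field names, mode strings, and the validation are assumptions made for this example, not the product's actual API.

```python
# Hypothetical payload builder mirroring the Compose form fields above.
# All field names and value strings are illustrative assumptions.

EXPORT_TYPES = {"hive_table", "directory"}                    # step 3
DB_UPDATE_MODES = {"append", "update_only", "insert_update"}  # step 8


def build_query_export(query, export_type="hive_table", source=None,
                       partition=None, data_store=None, db_table=None,
                       db_update_mode="append", cluster_label=None):
    """Collect the choices from steps 2-9 into one command payload."""
    if export_type not in EXPORT_TYPES:
        raise ValueError(f"unknown export type: {export_type}")
    if db_update_mode not in DB_UPDATE_MODES:
        raise ValueError(f"unknown DB update mode: {db_update_mode}")
    payload = {
        "command_type": "QueryExport",
        "query": query,                    # step 2: Hive query text
        "export_type": export_type,        # step 3: HiveTable or Directory
        "source": source,                  # step 4: Hive table or export directory
        "partition": partition,            # step 5: optional Hive partition
        "data_store": data_store,          # step 6: target data store
        "db_table": db_table,              # step 7: target table
        "db_update_mode": db_update_mode,  # step 8: Append is the default
    }
    if cluster_label is not None:          # step 9: optional Hadoop cluster
        payload["cluster_label"] = cluster_label
    return payload
```

For example, `build_query_export("SELECT * FROM sales", source="sales_export", data_store="mysql_ds", db_table="sales_summary")` yields a payload with `db_update_mode` defaulting to `"append"`, matching the default Append Mode in step 8.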

You can see the command result under the Results tab and the command logs under the Logs tab. The Logs tab has the Errors and Warnings filter. For more information on how to download command results and logs, see Downloading Results and Logs.