Canonical Hive Commands Report

GET /api/v1.2/reports/canonical/_hive/_commands/

This API provides you the canonical Hive commands report in JSON format. Currently, this report is not generated and Qubole intends to provide this report very soon.

Note

  • If the difference between the start and end dates is greater than 60 days, the system defaults to a 1-month window from the current date.
  • If either the start date or end date is not provided, then the system defaults to a 1-month window from the current date.
  • To get data for a window that is greater than 60 days, create a ticket with Qubole Support.

Required Role

The following users can make this API call:

  • Users who belong to the system-user or system-admin group.
  • Users who belong to a group associated with a role that allows viewing canonical Hive commands reports. See Managing Groups and Managing Roles for more information.

Parameters

Parameter Description
start_date The date from which you want the report (inclusive). This parameter supports the timestamp in the UTC timezone (YYYY-MM-DDTHH:MM:SSZ) format. The date cannot be earlier than 90 days.
end_date The date until which you want the report. The report contains data from this date also. The API default is today or now. This parameter also supports timestamp in the UTC timezone (YYYY-MM-DDTHH:MM:SSZ) format.
offset The starting point of the results. The API default is 0.
limit The number of results to fetch. The API default is 10.
sort_column The column used to sort the report. Since this report returns the top canonical_hive_commands, the sort order is always descending. Valid choices are ‘frequency’, ‘cpu’, ‘fs_bytes_read’, and ‘fs_bytes_written’. The API default is, frequency.
show_ast Also return the serialized AST corresponding to the canonical query. (Not returned by default.)

Response Parameters

Parameter Description
start_date The actual starting date of the report.
end_date The actual ending date of the report.
sort_column The sort column used.

An array of:

canonical_query_id The ID of the canonical query.
canonical_query The AST dump of the canonical query. (This is returned only when the show_ast parameter is passed.)
recent_example The most recent example of this type of queries.
frequency The number of queries of this type.
cpu The total cumulative CPU, (in ms), consumed by these queries.
fs_bytes_read The total bytes read by these queries.
fs_bytes_written The total bytes written by these queries.

Examples

Without any parameters

curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" \
"https://api.qubole.com/api/v1.2/reports/canonical_hive_commands"

Note

The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.

Sample Response

{
  "sort_column": "frequency",
  "canonical_queries": [
    {
      "canonical_query_id": "af09cd5799e52f450a87e236f453b864833afac97603409a17f3df4d010b1814",
      "fs_bytes_written": 0,
      "fs_bytes_read": 0,
      "frequency": 8800,
      "cpu": 0,
      "recent_example": "alter table demo_memetracker recover partitions"
    },
    {
      "canonical_query_id": "9548ac7ec7defe3c3251da2544ec545c9bd578cb308c6c3c1936e48df0bdfdb4",
      "fs_bytes_written": 0,
      "fs_bytes_read": 0,
      "frequency": 4547,
      "cpu": 0,
      "recent_example": "alter table daily_tick_data recover partitions"
    },
    {
      "canonical_query_id": "69dae07fc876927b9daba6279c962bc343131c08d8f9f98adfa0c05ef90b40a4",
      "fs_bytes_written": 0,
      "fs_bytes_read": 0,
      "frequency": 768,
      "cpu": 0,
      "recent_example": "show tables"
    },
    {
      "canonical_query_id": "89720fe23d2a85ac217a3b230e992c45dd523b65e6d45863cc410f4b5e4795ea",
      "fs_bytes_written": 0,
      "fs_bytes_read": 0,
      "frequency": 53,
      "cpu": 0,
      "recent_example": "select * from `default`.`memetracker` limit 400"
    },
    {
      "canonical_query_id": "04bccd848172842c8fadd687aef72ac2161f72895dfd3c1d3c31a96411d34095",
      "fs_bytes_written": 0,
      "fs_bytes_read": 0,
      "frequency": 49,
      "cpu": 0,
      "recent_example": "select * from `default`.`30days_test` limit 1 "
    },
    {
      "canonical_query_id": "9996d665cf077f60b1ee87d3b5b80cd65ce078f77935096441433628909b9ddb",
      "fs_bytes_written": 9,
      "fs_bytes_read": 28482500000,
      "frequency": 48,
      "cpu": 0,
      "recent_example": "select count(*) from memetracker"
    },
    {
      "canonical_query_id": "492ea35ff3d58d0d07e70bcc68ea33eadb7bb572f6fdf14f7220931cc94b1abc",
      "fs_bytes_written": 0,
      "fs_bytes_read": 0,
      "frequency": 37,
      "cpu": 0,
      "recent_example": "select * from `default`.`default_qubole_airline_origin_destination` limit 1000"
    },
    {
      "canonical_query_id": "5e2bb3326c4cf2c5b669d60729c5697fb6fdef4f1e09be41e37f424eb96b0c74",
      "fs_bytes_written": 0,
      "fs_bytes_read": 0,
      "frequency": 30,
      "cpu": 0,
      "recent_example": "select * from default_qubole_memetracker limit 10;"
    },
    {
      "canonical_query_id": "aa1c7a294f6e18feec68175a643814f06e180f4ac5e62eb6b556d9bf72830bc2",
      "fs_bytes_written": 25326,
      "fs_bytes_read": 85944,
      "frequency": 22,
      "cpu": 0,
      "recent_example": "select * from test_csv limit 5;"
    },
    {
      "canonical_query_id": "2145af0ee70e1cd93c9901cd41dee8285faacaefe86e4a4f22880316cc4e63c3",
      "fs_bytes_written": 21050200,
      "fs_bytes_read": 321138000,
      "frequency": 21,
      "cpu": 0,
      "recent_example": "select * from demo_memetracker limit 100"
    }
  ]
}

With a different sort column and limit and show_ast=true

curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" \ "https://api.qubole.com/api/v1.2/reports/canonical_hive_commands?sort_column=cpu&limit=2&show_ast=true"

Note

The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.

Sample Response

{
 "sort_column": "cpu",
 "canonical_queries": [
    {
      "canonical_query_id": "d9635e3ad5501c9ad47bb728c35b63e1b41f8c2ba0fb4f7533b9ab701ce503c4",
      "canonical_query": "(null(TOK_QUERY(TOK_FROM(TOK_TABREF(TOK_TABNAME(lineitem))))(TOK_INSERT(TOK_DESTINATION(TOK_DIR(TOK_TMP_FILE)))(TOK_SELECT(TOK_SELEXPR(TOK_FUNCTIONSTAR(count))))))())",
      "fs_bytes_written": 18,
      "fs_bytes_read": 7726360000,
      "frequency": 2,
      "cpu": 423130,
      "recent_example": "set fs.s3.inputpathprocessor=false;\nselect count(*) from lineitem;"
    },
    {
      "canonical_query_id": "ea1a37cf6b4694293d15cadfaf4bbae2459f12475cd86ea90e6c4f8e31945bda",
      "canonical_query": "(null(TOK_QUERY(TOK_FROM(TOK_TABREF(TOK_TABNAME(default_qubole_memetracker))))(TOK_INSERT(TOK_DESTINATION(TOK_DIR(TOK_TMP_FILE)))(TOK_SELECT(TOK_SELEXPR(TOK_FUNCTIONSTAR(count))))(TOK_WHERE(=(TOK_TABLE_OR_COL(month))(LITERAL)))))())",
      "fs_bytes_written": 108,
      "fs_bytes_read": 54673600000,
      "frequency": 20,
      "cpu": 416200,
      "recent_example": "SELECT count(*) FROM default_qubole_memetracker where month = '2008-08';"
    }
 ]
}

For a specific time period

curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" \ "https://api.qubole.com/api/v1.2/reports/canonical_hive_commands?start_date=2014-04-01&end_date=2014-04-21&sort_column=fs_bytes_read&limit=2"

Note

The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.

Sample Response

{
  "canonical_queries": [
     {
       "canonical_query_id": "55ebb0cc47e0dc74c70245b026126bba191969dee1dc380a6f98698e6b194085",
       "cpu": 75720,
       "frequency": 1,
       "recent_example": "select dt, count(*) from  junk_temp \ngroup by dt order by dt\n\n\n",
       "fs_bytes_read": 308582016,
       "fs_bytes_written": 1514
     },
     {
       "canonical_query_id": "1c421d2e65407650650cbc2ee80f9a59863875f52d1a9ddd5a051118678a3a6c",
       "cpu": 34980,
       "frequency": 1,
       "recent_example": "select created_at from  junk_temp \nwhere dt=2014-01-03 limit 10\n\n",
       "fs_bytes_read": 308582016,
       "fs_bytes_written": 0
     }
  ],
   "start_date": "2014-03-31T10:00:00Z",
   "end_date": "2014-04-21T20:00:00Z",
   "sort_column": "fs_bytes_read"
}

To learn more about canonicalization of hive queries, see the Blog post.

Caution

Qubole started collecting the CPU metrics only from the last week of December, 2013. So, if you have some queries before that, the CPU metrics is considered as 0.