Canonical Hive Commands Report
- GET /api/v1.2/reports/canonical/_hive/_commands/
This API provides you the canonical Hive commands report in JSON format. Currently, this report is not generated and Qubole intends to provide this report very soon.
Note
If the difference between the start and end dates is greater than 60 days, the system defaults to a 1-month window from the current date.
If either the start date or end date is not provided, then the system defaults to a 1-month window from the current date.
To get data for a window that is greater than 60 days, create a ticket with Qubole Support.
Required Role
The following users can make this API call:
Users who belong to the system-user or system-admin group.
Users who belong to a group associated with a role that allows viewing canonical Hive commands reports. See Managing Groups and Managing Roles for more information.
Parameters
Parameter |
Description |
---|---|
start_date |
The date from which you want the report (inclusive). This parameter supports the timestamp in the UTC timezone (YYYY-MM-DDTHH:MM:SSZ) format. The date cannot be earlier than 90 days. |
end_date |
The date until which you want the report. The report contains data from this date also. The API default is today or now. This parameter also supports timestamp in the UTC timezone (YYYY-MM-DDTHH:MM:SSZ) format. |
offset |
The starting point of the results. The API default is 0. |
limit |
The number of results to fetch. The API default is 10. |
sort_column |
The column used to sort the report. Since this report returns the top canonical_hive_commands, the sort order is always descending. Valid choices are ‘frequency’, ‘cpu’, ‘fs_bytes_read’, and ‘fs_bytes_written’. The API default is, frequency. |
show_ast |
Also return the serialized AST corresponding to the canonical query. (Not returned by default.) |
Response Parameters
Parameter |
Description |
---|---|
start_date |
The actual starting date of the report. |
end_date |
The actual ending date of the report. |
sort_column |
The sort column used. |
An array of:
canonical_query_id |
The ID of the canonical query. |
canonical_query |
The AST dump of the canonical query. (This is returned only when the show_ast parameter is passed.) |
recent_example |
The most recent example of this type of queries. |
frequency |
The number of queries of this type. |
cpu |
The total cumulative CPU, (in ms), consumed by these queries. |
fs_bytes_read |
The total bytes read by these queries. |
fs_bytes_written |
The total bytes written by these queries. |
Examples
Without any parameters
curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" \
"https://api.qubole.com/api/v1.2/reports/canonical_hive_commands"
Note
The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.
Sample Response
{
"sort_column": "frequency",
"canonical_queries": [
{
"canonical_query_id": "af09cd5799e52f450a87e236f453b864833afac97603409a17f3df4d010b1814",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 8800,
"cpu": 0,
"recent_example": "alter table demo_memetracker recover partitions"
},
{
"canonical_query_id": "9548ac7ec7defe3c3251da2544ec545c9bd578cb308c6c3c1936e48df0bdfdb4",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 4547,
"cpu": 0,
"recent_example": "alter table daily_tick_data recover partitions"
},
{
"canonical_query_id": "69dae07fc876927b9daba6279c962bc343131c08d8f9f98adfa0c05ef90b40a4",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 768,
"cpu": 0,
"recent_example": "show tables"
},
{
"canonical_query_id": "89720fe23d2a85ac217a3b230e992c45dd523b65e6d45863cc410f4b5e4795ea",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 53,
"cpu": 0,
"recent_example": "select * from `default`.`memetracker` limit 400"
},
{
"canonical_query_id": "04bccd848172842c8fadd687aef72ac2161f72895dfd3c1d3c31a96411d34095",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 49,
"cpu": 0,
"recent_example": "select * from `default`.`30days_test` limit 1 "
},
{
"canonical_query_id": "9996d665cf077f60b1ee87d3b5b80cd65ce078f77935096441433628909b9ddb",
"fs_bytes_written": 9,
"fs_bytes_read": 28482500000,
"frequency": 48,
"cpu": 0,
"recent_example": "select count(*) from memetracker"
},
{
"canonical_query_id": "492ea35ff3d58d0d07e70bcc68ea33eadb7bb572f6fdf14f7220931cc94b1abc",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 37,
"cpu": 0,
"recent_example": "select * from `default`.`default_qubole_airline_origin_destination` limit 1000"
},
{
"canonical_query_id": "5e2bb3326c4cf2c5b669d60729c5697fb6fdef4f1e09be41e37f424eb96b0c74",
"fs_bytes_written": 0,
"fs_bytes_read": 0,
"frequency": 30,
"cpu": 0,
"recent_example": "select * from default_qubole_memetracker limit 10;"
},
{
"canonical_query_id": "aa1c7a294f6e18feec68175a643814f06e180f4ac5e62eb6b556d9bf72830bc2",
"fs_bytes_written": 25326,
"fs_bytes_read": 85944,
"frequency": 22,
"cpu": 0,
"recent_example": "select * from test_csv limit 5;"
},
{
"canonical_query_id": "2145af0ee70e1cd93c9901cd41dee8285faacaefe86e4a4f22880316cc4e63c3",
"fs_bytes_written": 21050200,
"fs_bytes_read": 321138000,
"frequency": 21,
"cpu": 0,
"recent_example": "select * from demo_memetracker limit 100"
}
]
}
With a different sort column and limit and show_ast=true
curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" "https://api.qubole.com/api/v1.2/reports/canonical_hive_commands?sort_column=cpu&limit=2&show_ast=true"
Note
The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.
Sample Response
{
"sort_column": "cpu",
"canonical_queries": [
{
"canonical_query_id": "d9635e3ad5501c9ad47bb728c35b63e1b41f8c2ba0fb4f7533b9ab701ce503c4",
"canonical_query": "(null(TOK_QUERY(TOK_FROM(TOK_TABREF(TOK_TABNAME(lineitem))))(TOK_INSERT(TOK_DESTINATION(TOK_DIR(TOK_TMP_FILE)))(TOK_SELECT(TOK_SELEXPR(TOK_FUNCTIONSTAR(count))))))())",
"fs_bytes_written": 18,
"fs_bytes_read": 7726360000,
"frequency": 2,
"cpu": 423130,
"recent_example": "set fs.s3.inputpathprocessor=false;\nselect count(*) from lineitem;"
},
{
"canonical_query_id": "ea1a37cf6b4694293d15cadfaf4bbae2459f12475cd86ea90e6c4f8e31945bda",
"canonical_query": "(null(TOK_QUERY(TOK_FROM(TOK_TABREF(TOK_TABNAME(default_qubole_memetracker))))(TOK_INSERT(TOK_DESTINATION(TOK_DIR(TOK_TMP_FILE)))(TOK_SELECT(TOK_SELEXPR(TOK_FUNCTIONSTAR(count))))(TOK_WHERE(=(TOK_TABLE_OR_COL(month))(LITERAL)))))())",
"fs_bytes_written": 108,
"fs_bytes_read": 54673600000,
"frequency": 20,
"cpu": 416200,
"recent_example": "SELECT count(*) FROM default_qubole_memetracker where month = '2008-08';"
}
]
}
For a specific time period
curl -i -X GET -H "X-AUTH-TOKEN: $AUTH_TOKEN" -H "Accept: application/json" "https://api.qubole.com/api/v1.2/reports/canonical_hive_commands?start_date=2014-04-01&end_date=2014-04-21&sort_column=fs_bytes_read&limit=2"
Note
The above syntax uses https://api.qubole.com as the endpoint. Qubole provides other endpoints to access QDS that are described in Supported Qubole Endpoints on Different Cloud Providers.
Sample Response
{
"canonical_queries": [
{
"canonical_query_id": "55ebb0cc47e0dc74c70245b026126bba191969dee1dc380a6f98698e6b194085",
"cpu": 75720,
"frequency": 1,
"recent_example": "select dt, count(*) from junk_temp \ngroup by dt order by dt\n\n\n",
"fs_bytes_read": 308582016,
"fs_bytes_written": 1514
},
{
"canonical_query_id": "1c421d2e65407650650cbc2ee80f9a59863875f52d1a9ddd5a051118678a3a6c",
"cpu": 34980,
"frequency": 1,
"recent_example": "select created_at from junk_temp \nwhere dt=2014-01-03 limit 10\n\n",
"fs_bytes_read": 308582016,
"fs_bytes_written": 0
}
],
"start_date": "2014-03-31T10:00:00Z",
"end_date": "2014-04-21T20:00:00Z",
"sort_column": "fs_bytes_read"
}
To learn more about canonicalization of hive queries, see this blog post.
Caution
Qubole started collecting the CPU metrics only from the last week of December, 2013. So, if you have some queries before that, the CPU metrics is considered as 0.