Setting the JDBC Connection String
You must set the JDBC connection string that the JDBC driver uses for Hive, Presto, and Spark queries.
Note
The JDBC driver version 3.0 only supports Presto when the qds_bypass option is enabled.
Setting the Connection String for Hive and Presto Queries (AWS and Azure)
Note
The third-generation JDBC driver only supports Presto in its QDS-bypass mode.
Use the following syntax to set the JDBC connection string for Hive and Presto queries.
jdbc:qubole://<hive/presto/sqlcommand/spark>/<Cluster-Label>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]
In the connection string, the command type (<hive/presto/spark>) and the cluster label are mandatory; the database name and property name/value pairs are optional.
However, the third-generation JDBC driver in its QDS-bypass mode only supports Presto. So, in that mode, <presto> (command type) and the cluster label are mandatory; the database name and property name/value pairs are optional.
Note
If you do not specify a database, then in the query, specify either the database or fully-qualified table names.
An example of a connection string for a Hive query is shown below (applicable to JDBC driver version 2.3.2 and earlier).
jdbc:qubole://hive/default/tpcds_orc_500?endpoint=https://api.qubole.com;chunk_size=86
In the above example, https://api.qubole.com is one of the QDS endpoints on AWS. For a list of supported endpoints, see Supported Qubole Endpoints on Different Cloud Providers.
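If you connect programmatically, the connection string is passed to the standard JDBC DriverManager. The sketch below is illustrative only: it assumes the Qubole JDBC driver JAR is on the classpath and registers itself automatically (JDBC 4.x), that the driver honors the standard JDBC password property, and that the cluster label, database, table name, and environment variable shown are placeholders you replace with your own. The API token is supplied through a Properties object rather than embedded in the URL, in line with Qubole's recommendation not to put the password in the connection string.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class QuboleHiveQueryExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string: cluster label "default", database
        // "tpcds_orc_500", AWS endpoint, 86 MB result chunks.
        String url = "jdbc:qubole://hive/default/tpcds_orc_500"
                + "?endpoint=https://api.qubole.com;chunk_size=86";

        // Supply the account API token outside the URL (assumes the driver
        // reads the standard JDBC "password" property; QUBOLE_API_TOKEN is a
        // hypothetical environment variable holding the token).
        Properties props = new Properties();
        props.setProperty("password", System.getenv("QUBOLE_API_TOKEN"));

        try (Connection conn = DriverManager.getConnection(url, props);
             Statement stmt = conn.createStatement();
             // "my_table" is a hypothetical table in the selected database.
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM my_table")) {
            if (rs.next()) {
                System.out.println("row count: " + rs.getLong(1));
            }
        }
    }
}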
Connection String Properties for JDBC Driver
Note
Parameters marked in bold below are mandatory. Others are optional and have default values.
The JDBC driver version 3.0 only supports Presto when the qds_bypass option is enabled.
| Property Name | Property Description |
|---|---|
| password | You can set the account API token as the password. Warning: Qubole highly recommends not using the password in the JDBC connection string, as the password is prone to be exposed by the client tool that uses the string to connect to Qubole. As a safer alternative, use the interface that the client tool provides to enter the user password. |
| endpoint | The endpoint is not required only for the … |
| chunk_size | The chunk size in MB, used when streaming large results from Cloud storage. The default value is 100 MB. Reduce the default value if you face out-of-memory (OOM) issues. |
| catalog_name | Add this property and enter the catalog's name as its value. |
| skip_parsing | Set this property to … |
| stream_results | Enables the Presto FastStreaming feature. Presto FastStreaming streams results directly from Cloud Object Storage in the JDBC driver. The streaming behavior can help BI tool performance because results are displayed as soon as they are available in Cloud Object Storage. Presto FastStreaming for the JDBC driver is supported in Presto versions 0.193 and 0.208, and it is only applicable to QDS-on-AWS and Qubole-on-GCP. Because streaming cannot be used with Presto Smart Query Retry, the Presto FastStreaming feature automatically disables Presto Smart Query Retry. Create a ticket with Qubole Support to enable the Presto FastStreaming feature on the account. |
| useS3 | This property is set to ensure that the JDBC driver bypasses the QDS Control Plane and downloads results directly from Cloud Object Storage. It is set to … |
| qds_bypass | This property is only applicable to the third-generation JDBC driver's QDS-bypass mode. Set this property to true to enable the QDS-bypass mode. |
| show_on_ui | This property is only applicable to the third-generation JDBC driver's QDS-bypass mode. Set this property to … |
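For illustration only, a connection string that combines several of these optional properties might look like the following; the cluster label (presto-cluster), database (default), catalog name, and chunk size shown here are placeholder values chosen for this example, not defaults from this documentation.

jdbc:qubole://presto/presto-cluster/default?endpoint=https://api.qubole.com;catalog_name=hive;chunk_size=50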
Modes of JDBC Driver Version 3.0
The third-generation JDBC driver supports two modes:
Legacy mode: when qds_bypass is set to false, the driver behaves just like its earlier versions (2.3.2 and earlier).
QDS-bypass mode: when qds_bypass is set to true.
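As an illustrative sketch (the cluster label, database, and endpoint are placeholders), a Presto connection string that runs the third-generation driver in QDS-bypass mode adds qds_bypass=true to the property list:

jdbc:qubole://presto/presto-cluster/default?endpoint=https://api.qubole.com;qds_bypass=true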
The following table shows the supported and unsupported properties of JDBC driver version 3.0 in QDS-bypass mode and of the previous version, 2.3.2.
| Property Name | Version 2.3.2 | QDS-Bypass Mode in Version 3 |
|---|---|---|
| password | | |
| endpoint | | |
| chunk_size | | |
| catalog_name | | |
| skip_parsing | | |
| stream_results | | |
| useS3 | | |
| qds_bypass (v3) | | |
| show_on_ui (v3) | | |
Additional Properties (Optional)
In addition, you can:
Enable Logging as described in Enabling Logging.
Enable Proxy as described in Enabling the Proxy Connection.
Setting the Connection String for Spark Queries
Note
The JDBC driver version 3.0 only supports Presto when the qds_bypass option is enabled.
Use the following syntax to set the JDBC connection string for Spark queries.
jdbc:qubole://spark/<Cluster-Label>/<app-id>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]
For example:
jdbc:qubole://spark/spark-cluster/85/my-sql?endpoint=https://us.qubole.com;chunk_size=86;password=<API token>;useS3=true
Warning
Qubole highly recommends not using the password in the JDBC connection string as the password is prone to be exposed by the client tool that uses the string for connecting to Qubole. So, as a safe alternative, use the interface that the client tool provides to enter the user password.
Note
Create an app with the configuration parameter zeppelin.spark.maxResult=<A VERY BIG VALUE>. The driver can return only the configured maximum number of result rows.
In the connection string, spark (command type) and the cluster label are mandatory; the database name and property name/value pairs are optional.
Note
If you do not specify a database, then in the query, specify either the database or fully-qualified table names.
Specifying app-id is mandatory. An app is the main abstraction in the Spark Job Server API; it is used to store the configuration for a Spark application. Creating an app returns an app-id. (You can get the app id using the GET API http://api.qubole.com/api/v1.2/apps.) See Understanding the Spark Job Server for more information.
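As a minimal sketch of retrieving app IDs programmatically, the example below assumes Java 11+, that the account API token is exported in a hypothetical QUBOLE_API_TOKEN environment variable, and that the API accepts the token in an X-AUTH-TOKEN request header; adjust the endpoint to match the one used in your connection string.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ListQuboleApps {
    public static void main(String[] args) throws Exception {
        // GET the list of Spark Job Server apps; the JSON response contains the app ids.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.qubole.com/api/v1.2/apps"))
                .header("X-AUTH-TOKEN", System.getenv("QUBOLE_API_TOKEN")) // account API token (assumed header name)
                .header("Content-Type", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}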