Setting the JDBC Connection String
You must set the JDBC connection string that the JDBC driver uses for Hive, Presto, and Spark queries.
Note
The JDBC driver version 3.0 only supports Presto when the qds_bypass option is enabled.
Setting the Connection String for Hive and Presto Queries (AWS and Azure)
Note
The third-generation JDBC driver only supports Presto in its QDS-bypass mode.
Use the following syntax to set the JDBC connection string for Hive and Presto queries.
jdbc:qubole://<hive/presto/sqlcommand/spark>/<Cluster-Label>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]
In the connection string, the command type (<hive/presto/spark>) and the cluster label are mandatory; the database name and property name/value pairs are optional.
However, the third-generation JDBC driver in its QDS-bypass mode only supports Presto. So, in that mode, <presto> (command type) and the cluster label are mandatory; the database name and property name/value pairs are optional.
Note
If you do not specify a database, then in the query, specify either the database or fully-qualified table names.
An example of a connection string for a Hive query is shown below (applicable to JDBC driver version 2.3.2 and earlier).
jdbc:qubole://hive/default/tpcds_orc_500?endpoint=https://api.qubole.com;chunk_size=86
In the above example, https://api.qubole.com is one of the QDS endpoints on AWS. For a list of supported endpoints, see Supported Qubole Endpoints on Different Cloud Providers.
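If you connect programmatically, the connection string is passed to the standard JDBC DriverManager. The sketch below is illustrative only: it assumes the Qubole JDBC driver JAR is on the classpath and registers itself automatically (JDBC 4.x), that the driver honors the standard JDBC password property, and that the cluster label, database, table name, and environment variable shown are placeholders you replace with your own. The API token is supplied through a Properties object rather than embedded in the URL, in line with Qubole's recommendation not to put the password in the connection string.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class QuboleHiveQueryExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string: cluster label "default", database
        // "tpcds_orc_500", AWS endpoint, 86 MB result chunks.
        String url = "jdbc:qubole://hive/default/tpcds_orc_500"
                + "?endpoint=https://api.qubole.com;chunk_size=86";

        // Supply the account API token outside the URL (assumes the driver
        // reads the standard JDBC "password" property; QUBOLE_API_TOKEN is a
        // hypothetical environment variable holding the token).
        Properties props = new Properties();
        props.setProperty("password", System.getenv("QUBOLE_API_TOKEN"));

        try (Connection conn = DriverManager.getConnection(url, props);
             Statement stmt = conn.createStatement();
             // "my_table" is a hypothetical table in the selected database.
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM my_table")) {
            if (rs.next()) {
                System.out.println("row count: " + rs.getLong(1));
            }
        }
    }
}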
Connection String Properties for JDBC Driver
Note
Parameters marked in bold below are mandatory. Others are optional and have default values.
The JDBC driver version 3.0 only supports Presto when the qds_bypass option is enabled.
| Property Name | Property Description |
|---|---|
| password | You can set the account API token as the password. Warning: Qubole highly recommends not using the password in the JDBC connection string, as the password is prone to be exposed by the client tool that uses the string to connect to Qubole. As a safer alternative, use the interface that the client tool provides to enter the user password. |
| endpoint | The endpoint is not required only for the … |
| chunk_size | The chunk size in MB, used when streaming large results from Cloud storage. The default value is 100 MB. Reduce the default value if you face out-of-memory (OOM) issues. |
| catalog_name | Add this property and enter the catalog's name as its value. |
| skip_parsing | Set this property to … |
| stream_results | Enables the Presto FastStreaming feature. Presto FastStreaming streams results directly from Cloud Object Storage in the JDBC driver. The streaming behavior can help BI tool performance because results are displayed as soon as they are available in Cloud Object Storage. Presto FastStreaming for the JDBC driver is supported in Presto versions 0.193 and 0.208, and it is only applicable to QDS-on-AWS and Qubole-on-GCP. Because streaming cannot be used with Presto Smart Query Retry, the Presto FastStreaming feature automatically disables Presto Smart Query Retry. Create a ticket with Qubole Support to enable the Presto FastStreaming feature on the account. |
| useS3 | This property is set to ensure that the JDBC driver bypasses the QDS Control Plane and downloads results directly from Cloud Object Storage. It is set to … |
| qds_bypass | This property is only applicable to the third-generation JDBC driver's QDS-bypass mode. Set this property to true to enable the QDS-bypass mode. |
| show_on_ui | This property is only applicable to the third-generation JDBC driver's QDS-bypass mode. Set this property to … |
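For illustration only, a connection string that combines several of these optional properties might look like the following; the cluster label (presto-cluster), database (default), catalog name, and chunk size shown here are placeholder values chosen for this example, not defaults from this documentation.

jdbc:qubole://presto/presto-cluster/default?endpoint=https://api.qubole.com;catalog_name=hive;chunk_size=50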
Modes of JDBC Driver Version 3.0
The third-generation JDBC driver supports two modes:
Legacy mode: when qds_bypass is set to false, the driver behaves just like its earlier versions (2.3.2 and earlier).
QDS-bypass mode: when qds_bypass is set to true.
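As an illustrative sketch (the cluster label, database, and endpoint are placeholders), a Presto connection string that runs the third-generation driver in QDS-bypass mode adds qds_bypass=true to the property list:

jdbc:qubole://presto/presto-cluster/default?endpoint=https://api.qubole.com;qds_bypass=true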
The following table shows the supported and unsupported properties of JDBC driver version 3.0 in QDS-bypass mode and of the previous version, 2.3.2.
| Property Name | Version 2.3.2 | QDS-Bypass Mode in Version 3 |
|---|---|---|
| password | | |
| endpoint | | |
| chunk_size | | |
| catalog_name | | |
| skip_parsing | | |
| stream_results | | |
| useS3 | | |
| qds_bypass (v3) | | |
| show_on_ui (v3) | | |
Additional Properties (Optional)
In addition, you can:
Enable Logging as described in Enabling Logging.
Enable Proxy as described in Enabling the Proxy Connection.
Setting the Connection String for Spark Queries
Note
The JDBC driver version 3.0 only supports Presto when the qds_bypass option is enabled.
Use the following syntax to set the JDBC connection string for Spark queries.
jdbc:qubole://spark/<Cluster-Label>/<app-id>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]
For example:
jdbc:qubole://spark/spark-cluster/85/my-sql?endpoint=https://us.qubole.com;chunk_size=86;password=<API token>;useS3=true
Warning
Qubole highly recommends not using the password in the JDBC connection string as the password is prone to be exposed by the client tool that uses the string for connecting to Qubole. So, as a safe alternative, use the interface that the client tool provides to enter the user password.
Note
Create an app with the configuration parameter zeppelin.spark.maxResult=<A VERY BIG VALUE>. The driver can return only the configured maximum number of result rows.
In the connection string, spark (command type) and the cluster label are mandatory; the database name and property name/value pairs are optional.
Note
If you do not specify a database, then in the query, specify either the database or fully-qualified table names.
Specifying app-id is mandatory. An app is the main abstraction in the Spark Job Server API; it is used to store the configuration for a Spark application. Creating an app returns an app-id. (You can get the app id using the GET API http://api.qubole.com/api/v1.2/apps.) See Understanding the Spark Job Server for more information.
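As a minimal sketch of retrieving app IDs programmatically, the example below assumes Java 11+, that the account API token is exported in a hypothetical QUBOLE_API_TOKEN environment variable, and that the API accepts the token in an X-AUTH-TOKEN request header; adjust the endpoint to match the one used in your connection string.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ListQuboleApps {
    public static void main(String[] args) throws Exception {
        // GET the list of Spark Job Server apps; the JSON response contains the app ids.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.qubole.com/api/v1.2/apps"))
                .header("X-AUTH-TOKEN", System.getenv("QUBOLE_API_TOKEN")) // account API token (assumed header name)
                .header("Content-Type", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}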