Setting the JDBC Connection String

You must set the JDBC connection string for Hive, Presto, and Spark queries on the JDBC driver.

Note

The JDBC driver version 3.0 only supports Presto when the qds_bypass option is enabled.

Setting the Connection String for Hive and Presto Queries (AWS and Azure)

Use the following syntax to set the JDBC connection string for Hive and Presto queries.

jdbc:qubole://<hive/presto/sqlcommand/spark>/<Cluster-Label>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]

In the connection string, the command type (<hive/presto/sqlcommand/spark>) and the cluster label are mandatory; the database name and the property name/value pairs are optional.

However, the third-generation JDBC driver in its QDS-bypass mode supports only Presto. In that mode, presto (command type) and the cluster label are mandatory; the database name and the property name/value pairs are optional.

Note

If you do not specify a database in the connection string, either specify the database in the query or use fully qualified table names.

The following is an example of a connection string for a Hive query (applicable to JDBC driver 2.3.2 and older versions).

jdbc:qubole://hive/default/tpcds_orc_500?endpoint=https://api.qubole.com;chunk_size=86

In the above example, default is the cluster label, tpcds_orc_500 is the database, and https://api.qubole.com is one of the QDS endpoints on AWS. For a list of supported endpoints, see Supported Qubole Endpoints on Different Cloud Providers.
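As an illustration of the syntax above, the following Java sketch assembles a Hive or Presto connection string from its parts. The class QuboleUrlBuilder and its build method are hypothetical helpers written for this page, not part of the Qubole JDBC driver; the sketch only concatenates strings and assumes that the values need no escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of the Qubole driver): assembles a connection
// string of the form
// jdbc:qubole://<command-type>/<Cluster-Label>[/<database>][?name=value[;name=value]...]
final class QuboleUrlBuilder {

    static String build(String commandType, String clusterLabel,
                        String database, Map<String, String> properties) {
        // The command type and the cluster label are the mandatory parts.
        if (commandType == null || commandType.isEmpty()
                || clusterLabel == null || clusterLabel.isEmpty()) {
            throw new IllegalArgumentException(
                    "command type and cluster label are mandatory");
        }
        StringBuilder url = new StringBuilder("jdbc:qubole://")
                .append(commandType).append('/').append(clusterLabel);
        // The database is optional.
        if (database != null && !database.isEmpty()) {
            url.append('/').append(database);
        }
        // Properties are optional; they follow '?' and are separated by ';'.
        if (properties != null && !properties.isEmpty()) {
            StringJoiner joined = new StringJoiner(";", "?", "");
            for (Map.Entry<String, String> entry : properties.entrySet()) {
                joined.add(entry.getKey() + "=" + entry.getValue());
            }
            url.append(joined);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("endpoint", "https://api.qubole.com");
        props.put("chunk_size", "86");
        // Rebuilds the Hive example string shown above.
        System.out.println(build("hive", "default", "tpcds_orc_500", props));
    }
}
```

Passing null for the database and the properties yields only the mandatory part, for example jdbc:qubole://presto/<Cluster-Label> with the label filled in.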

Connection String Properties for JDBC Driver

Note

Parameters marked in bold below are mandatory. Others are optional and have default values.

Property Name Property Description
password

You can set the account API token as the password, as in password=<API token>. For how to get the API token from the Control Panel UI of the account, see Managing Your Accounts.

Warning

Qubole highly recommends not using the password in the JDBC connection string because the client tool that uses the string to connect to Qubole can expose it. As a safe alternative, use the interface that the client tool provides to enter the user password.

endpoint

You can omit the endpoint only when you use https://api.qubole.com. You must specify the API endpoint for other QDS-on-AWS endpoints and for other Cloud providers. For the list, see Supported Qubole Endpoints on Different Cloud Providers.

chunk_size

The chunk size, in MB, that is used when streaming large results from the Cloud storage. The default value is 100 MB. Reduce the value if you face out-of-memory (OOM) issues.

catalog_name

Add this property and enter the catalog’s name as its value.

skip_parsing

Set this property to true to allow the driver to skip parsing the query and send it directly to QDS.

stream_results

This property enables the Presto FastStreaming feature, which streams results directly from Cloud Object Storage in the JDBC driver. Streaming can help BI tool performance because results are displayed as soon as they are available in Cloud Object Storage. Presto FastStreaming for the JDBC driver is supported in Presto versions 0.193 and 0.208. It is only applicable to QDS-on-AWS and Qubole-on-GCP. Because streaming cannot be used with Presto Smart Query Retry, the Presto FastStreaming feature automatically disables Presto Smart Query Retry.

Create a ticket with Qubole Support to enable the Presto FastStreaming feature on the account.

useS3

When this property is set to true, the JDBC driver bypasses the QDS Control Plane and downloads results directly from Cloud Object Storage. The default value is true. By default, the result file size limit for each QDS account is 20 MB. If the result set is larger than 20 MB, the driver downloads results directly from Cloud Object Storage irrespective of the useS3 property’s value. To increase this limit, create a ticket with Qubole Support.

qds_bypass

This property is only applicable to the third-generation JDBC driver’s QDS-bypass mode. Set this property to true to allow the driver to communicate directly with the Presto coordinator for submitting commands and fetching results. If you do not set this property, the driver behaves like JDBC driver 2.3.2 or older versions.

show_on_ui

This property is only applicable to the third-generation JDBC driver’s QDS-bypass mode. Set this property to true if you want the queries to be displayed on Qubole’s UI. Setting this property returns QueryHistID as part of the error message for queries executed through the third-generation JDBC driver.
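For a Java program that connects without a client tool, one way to follow the password warning above is to keep the API token out of the connection string and pass it through the standard JDBC connection properties instead. The following is a minimal sketch under stated assumptions: that the Qubole driver reads the standard JDBC password property (not confirmed on this page), that the driver JAR is on the classpath, and that the token is stored in an environment variable named QUBOLE_API_TOKEN, which is a hypothetical name chosen for this example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper (not part of the Qubole driver): keeps the API token
// out of the connection string.
final class QubolePasswordOutsideUrl {

    // Builds JDBC connection properties that carry the API token.
    // Assumption: the driver honors the standard JDBC "password" property.
    static Properties credentials(String apiToken) {
        if (apiToken == null || apiToken.isEmpty()) {
            throw new IllegalArgumentException("API token is missing");
        }
        Properties props = new Properties();
        props.setProperty("password", apiToken);
        return props;
    }

    // Opens a connection; the URL itself contains no password.
    static Connection connect(String url, String apiToken) throws SQLException {
        return DriverManager.getConnection(url, credentials(apiToken));
    }

    public static void main(String[] args) {
        // Placeholder URL: replace <Cluster-Label> with a real cluster label.
        String url = "jdbc:qubole://presto/<Cluster-Label>";
        // QUBOLE_API_TOKEN is a hypothetical environment variable name.
        String token = System.getenv("QUBOLE_API_TOKEN");
        if (token == null || token.isEmpty()) {
            System.err.println("Set QUBOLE_API_TOKEN before running.");
            return;
        }
        try (Connection connection = connect(url, token)) {
            System.out.println("Connected: " + !connection.isClosed());
        } catch (SQLException e) {
            // Also reached when the driver JAR is not on the classpath.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```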

Modes of JDBC Driver Version 3.0

The third-generation JDBC driver supports two modes:

  • Legacy mode. Applies when qds_bypass is set to false or is not set. The driver behaves like version 2.3.2 and earlier versions.
  • QDS-bypass mode. Applies when qds_bypass is set to true. The driver communicates directly with the Presto coordinator to submit commands and fetch results.

The following table shows which properties are supported in JDBC driver version 2.3.2 and in the QDS-bypass mode of version 3.0.

Property Name      Version 2.3.2    QDS-Bypass Mode in Version 3.0
password           Yes              Yes
endpoint           Yes              Yes
chunk_size         Yes              No
catalog_name       Yes              Yes
skip_parsing       Yes              No
stream_results     Yes              No
useS3              Yes              No
qds_bypass (v3)    No               Yes
show_on_ui (v3)    No               Yes
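The table above can be turned into a simple pre-flight check. The following Java sketch lists which of the requested properties the table marks as unsupported in the QDS-bypass mode. The class BypassPropertyCheck is a hypothetical helper for illustration; the driver itself may validate or ignore properties differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of the Qubole driver).
final class BypassPropertyCheck {

    // Properties that the table marks as unsupported in QDS-bypass mode.
    static final Set<String> UNSUPPORTED_IN_BYPASS = new HashSet<>(Arrays.asList(
            "chunk_size", "skip_parsing", "stream_results", "useS3"));

    // Returns, in input order, the names that the table marks as unsupported
    // in QDS-bypass mode. Names that the table does not list are not reported.
    static List<String> unsupportedInBypass(List<String> propertyNames) {
        List<String> unsupported = new ArrayList<>();
        for (String name : propertyNames) {
            if (UNSUPPORTED_IN_BYPASS.contains(name)) {
                unsupported.add(name);
            }
        }
        return unsupported;
    }

    public static void main(String[] args) {
        List<String> requested =
                Arrays.asList("endpoint", "qds_bypass", "chunk_size", "useS3");
        System.out.println("Unsupported in QDS-bypass mode: "
                + unsupportedInBypass(requested));
    }
}
```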

Additional Properties (Optional)

In addition, you can:

Setting the Connection String for Spark Queries

Note

The JDBC driver version 3.0 only supports Presto when the qds_bypass option is enabled.

Use the following syntax to set the JDBC connection string for Spark queries.

jdbc:qubole://spark/<Cluster-Label>/<app-id>[/<database>][?propertyName1=propertyValue1[;propertyName2=propertyValue2]...]

For example:

jdbc:qubole://spark/spark-cluster/85/my-sql?endpoint=https://us.qubole.com;chunk_size=86;password=<API token>;useS3=true

Warning

Qubole highly recommends not using the password in the JDBC connection string as the password is prone to be exposed by the client tool that uses the string for connecting to Qubole. So, as a safe alternative, use the interface that the client tool provides to enter the user password.

Note

Create an app with the configuration parameter zeppelin.spark.maxResult=<A VERY BIG VALUE>. A query can return only up to the configured maximum number of rows.

In the connection string, spark (command type), the cluster label, and the app-id are mandatory; the database name and the property name/value pairs are optional.

Note

If you do not specify a database in the connection string, either specify the database in the query or use fully qualified table names.

Specifying the app-id is mandatory. An app is the main abstraction in the Spark Job Server API. It is used to store the configuration for a Spark application. Creating an app returns an app-id. You can also get the app-id by using the GET API http://api.qubole.com/api/v1.2/apps. For more information, see Understanding the Spark Job Server.
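The Spark syntax differs from the Hive and Presto syntax only in the mandatory app-id segment. The following Java sketch assembles a Spark connection string and rejects a missing cluster label or app-id. The class QuboleSparkUrl is a hypothetical helper, not part of the Qubole JDBC driver, and it assumes that the values need no escaping. In line with the warning above, the sketch leaves the password out of the string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the Qubole driver): assembles a string of the form
// jdbc:qubole://spark/<Cluster-Label>/<app-id>[/<database>][?name=value[;name=value]...]
final class QuboleSparkUrl {

    static String build(String clusterLabel, String appId,
                        String database, Map<String, String> properties) {
        // The cluster label and the app-id are mandatory for Spark queries.
        requireNonEmpty(clusterLabel, "cluster label");
        requireNonEmpty(appId, "app-id");
        StringBuilder url = new StringBuilder("jdbc:qubole://spark/")
                .append(clusterLabel).append('/').append(appId);
        // The database is optional.
        if (database != null && !database.isEmpty()) {
            url.append('/').append(database);
        }
        // The first property follows '?'; later ones are separated by ';'.
        if (properties != null) {
            char separator = '?';
            for (Map.Entry<String, String> entry : properties.entrySet()) {
                url.append(separator)
                   .append(entry.getKey()).append('=').append(entry.getValue());
                separator = ';';
            }
        }
        return url.toString();
    }

    private static void requireNonEmpty(String value, String name) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException(name + " is mandatory");
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("endpoint", "https://us.qubole.com");
        props.put("chunk_size", "86");
        props.put("useS3", "true");
        // Same parts as the Spark example above, minus the password.
        System.out.println(build("spark-cluster", "85", "my-sql", props));
    }
}
```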