Configuring the ODBC Driver

A Data Source Name (DSN) is a logical name that identifies a database and the set of connection attributes used to connect to the database. Configure the DSN in the ODBC driver.

The following subtopics cover the configuration-related information:

Prerequisites

These are the prerequisites:

  • Ensure that you have the access to the object/resource that you refer to in the driver configuration. If you see an access denied error, check with the Qubole account administrator. For more information, see Resources, Actions, and What they Mean.

Types of DSN

There are two types of DSNs that you can configure:

  • System DSN
  • User DSN

The /etc/odbcinst.ini file contains information about ODBC drivers available to users and the /etc/odbc.ini file contains information about DSNs available to all users. These System DSNs are useful for applications such as web servers that may not be running as a real user and so there is no home directory to contain an .odbc.ini file. It is a good practice to have DSN defined for each user as it does not affect the system configuration.

Creating a User-level DSN

If you are using ODBC driver for Linux version 2.2.1 and later versions, then you must create user-level DSNs later as the driver only creates a system-level DSN during its installation.

To create a user DSN, copy odbc.ini, odbcinst.ini, and qubole.odbc.ini configuration files from the \etc\ directory to the home directory (~), change configuration as required and make these files hidden.

Configuring a User-level DSN

To configure a user DSN, set the ODBC environment variables in the user profile. Add an entry into .bash_profile or .profile to initialize the user environment during a bash login.

If you are using iODBC- Insert ODBCINI and ODBCINSTINI, configure the corresponding files as illustrated here.

[ec2-user@aws-instance ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin
ODBCINI=$HOME/.odbc.ini
ODBCINSTINI=$HOME/.odbcinst.ini
export PATH
export ODBCINI
export ODBCINSTINI

If you are using unixODBC- Insert ODBCSYSINI and ODBCINSTINI, configure the corresponding files as illustrated here.

[ec2-user@aws-instance ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
 . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin
ODBCSYSINI=$HOME/.odbc.ini
ODBCINSTINI=$HOME/.odbcinst.ini
export PATH
export ODBCSYSINI
export ODBCINSTINI

Configuring the System DSN

The Qubole ODBC driver comes with a default DSN that you can see in the the odbc.ini file. After you install the driver, you can trace the .odbc.ini file in the /usr/local/qubole/ location. You can add another DSN or modify the existing DSN property values. For the Qubole ODBC driver, these are the defaults in the .odbc.ini file.

Note

Turning off QDS_BYPASS enables the legacy mode in which the ODBC driver defaults to the driver version 2.1.1. Set QDS_BYPASS to false to enable the legacy mode.

[ODBC]
# Specify any global ODBC configuration here such as ODBC tracing.

[ODBC Data Sources]
Qubole ODBC Driver DSN=QuboleODBC

[Qubole ODBC Driver DSN]
# This key is not necessary and it is to just describe the data source.
Description=Qubole ODBC Driver DSN

# Driver: The location where the ODBC driver is installed to.
Driver=/usr/local/qubole/libquboleodbc.so
# APITOKEN: The API Token for the account as provided by Qubole.
APITOKEN=""
# CLUSTER_LABEL: The label of the cluster on which queries are to be run. The driver in its QDS_bypass mode (default)
# only supports Presto. The driver in its legacy mode supports Hive, Presto, and Spark.
CLUSTER_LABEL=default
# ENDPOINT: The Qubole endpoint for the account as provided by Qubole.
ENDPOINT=https://api.qubole.com
#DSI: The data source name is only Presto in the QDS_bypass mode (default). The data source name is Hive, Spark,
# or Presto when the driver is in legacy mode.
DSI=presto
#App ID is only for Spark in the legacy mode
APPID=""
# Schema name
SCHEMA_NAME=default
# Catalog name
CATALOG=hive
#S3 Bucket region. The driver only supports it in the legacy mode.
BUCKET_REGION=us-east-1
#Bypass QDS Control Plane. The driver only supports it in the legacy mode.
USE_S3=true
#Enable Presto FastStreaming. The driver only supports it in the legacy mode.
STREAM_RESULTS=true
#The QDS BYPASS mode is enabled by default in the QDS Bypass mode. Disable it to use it in the legacy mode for Hive/Spark queries.
QDS_BYPASS=true
#SHOW_ON_UI is enabled by default in the QDS Bypass mode. Disable it to use it in the legacy mode for Hive/Spark queries.
SHOW_ON_UI=true

In the .odbc.ini file, you can edit any DSN property’s value. You can change these DSN properties as per the requirements:

Note

Turning off QDS_BYPASS enables the legacy mode in which the ODBC driver defaults to the driver version 2.1.1. Set QDS_BYPASS to false to enable the legacy mode.

  • APITOKEN: Enter The API token of the QDS account.

    Note

    To find the API token, navigate to Control Panel on the Qubole UI and click the My Accounts tab. Click Show for the account and copy the API token that is displayed. For more information, see Managing Your Accounts.

  • CLUSTER_LABEL: The cluster label is default, which implies that it is default cluster of the account. You can change it by adding the cluster’s label that you want to use.

  • ENDPOINT: It is the QDS environment. By default, it is https://api.qubole.com. To change it to a different QDS environment on which you have the QDS account, add that QDS environment as the endpoint. For more information on the endpoints, see Supported Qubole Endpoints on Different Cloud Providers.

  • DSI: It is the Data Source Interface. It supports only Presto in the QDS Bypass mode.

    Only in the legacy mode, you can set it to Hive or Spark as required in the legacy mode.

  • APPID: When you set the Data Source Name as Spark, enter the APP ID. It is not applicable to the Presto or Hive data source. The driver only supports this property in the legacy mode.

  • SCHEMA_NAME:Enter the schema name as its value to filter to that schema while fetching metadata so that only the related TABL_SCHEM configured is exposed.

  • CATALOG: Add this property and enter the catalog’s name as its value.

  • USE_S3: This property is enabled/set to true by default. It is set to bypass QDS Control Plane for getting results directly from the S3 location. You can enter false later to disable it or remove the keyword. The driver only supports this property in the legacy mode.

  • BUCKET_REGION: Add this property as a keyword and enter the AWS region which you use as the cloud object storage for final results (default storage location (defloc)). If you do not specify the AWS region, then it defaults to us-east-1 AWS region. The driver only supports this property in the legacy mode.

  • STREAM_RESULTS: Set it to true to enable Presto FastStreaming. It enables streaming of results directly from AWS S3 in the ODBC driver. This is in contrast to the earlier behavior wherein the driver waits for the query to finish before downloading any results from the QDS Control Plane or from S3. The streaming behavior can help the BI tool performance as results are displayed as soon as they are available in S3. Presto FastStreaming for ODBC driver is supported in Presto versions 0.193 and 0.208. As streaming cannot be used with Presto Smart Query Retry, the Presto FastStreaming feature automatically disables Presto Smart Query Retry. The driver only supports this property in the legacy mode.

    Note

    Create a ticket with Qubole Support to enable the Presto FastStreaming feature on the account.

  • QDS_BYPASS: This property is only applicable to the third-generation ODBC driver’s QDS BYPASS mode.

    Note

    Currently, this property only applies to AWS. Do not set this option for the Azure cloud endpoint, azure.qubole.com.

    By default, this property is set to true to allow the driver to directly communicate with Presto master for submitting commands and fetching results. If you do not set this property, then the driver behaves as the ODBC driver in the legacy mode.

  • SHOW_ON_UI: This property is only applicable to the third-generation ODBC driver’s QDS BYPASS mode. By default, this property is set to true to display the queries on Qubole’s UI. It returns QueryHistID as part of the error message for queries executed through the third-generation ODBC driver.

Modes of ODBC Driver Version 3.0 for Linux

The third-generation ODBC driver supports two modes:

  • Legacy mode. It is when QDS_BYPASS is set to false. The driver behaves just like its previous versions 2.1.1.
  • QDS-bypass mode. It is when QDS_BYPASS is set to true.

The following table shows the supported/unsupported properties of the ODBC driver version 3.0 in the QDS-Bypass mode and the previous version 2.1.1.

Property Name Version 2.1.1/Legacy mode in version 3 QDS-Bypass Mode in version 3
APITOKEN ../../../../_images/tick-mark1.png ../../../../_images/tick-mark1.png
CLUSTER_LABEL ../../../../_images/tick-mark1.png ../../../../_images/tick-mark1.png
ENDPOINT ../../../../_images/tick-mark1.png ../../../../_images/tick-mark1.png
DSI ../../../../_images/tick-mark1.png ../../../../_images/tick-mark1.png
APPID ../../../../_images/tick-mark1.png ../../../../_images/cross-mark1.png
SCHEMA_NAME ../../../../_images/tick-mark1.png ../../../../_images/tick-mark1.png
CATALOG ../../../../_images/tick-mark1.png ../../../../_images/tick-mark1.png
USE_S3 ../../../../_images/tick-mark1.png ../../../../_images/cross-mark1.png
BUCKET_REGION ../../../../_images/tick-mark1.png ../../../../_images/cross-mark1.png
STREAM_RESULTS ../../../../_images/tick-mark1.png ../../../../_images/cross-mark1.png
QDS_BYPASS (v3) ../../../../_images/cross-mark1.png ../../../../_images/tick-mark1.png
SHOW_ON_UI (v3) ../../../../_images/cross-mark1.png ../../../../_images/tick-mark1.png

Configuring a Multi-user DSN

Configure a multi-user DSN by editing a .odbc.ini file in the home directory of the current user ($HOME).

Create a DSN into the section ODBC Data Sources as mentioned below:

[ODBC Data Sources]
Qubole ODBC Driver DSN=QuboleODBC
Test=QuboleODBC

[Test]
Description=Qubole ODBC Driver DSN
# Driver: The location where the ODBC driver is installed.
Driver=/usr/local/qubole/libquboleodbc.so
# APITOKEN: The API Token for the account as provided by Qubole.
APITOKEN="<API Token>"
# CLUSTER_LABEL: The label of the cluster on which queries are to be run.
CLUSTER_LABEL=presto
# ENDPOINT: The Qubole endpoint for the account as provided by Qubole.
ENDPOINT=https://us.qubole.com
#DSI: The data source name is only Presto in the QDS_bypass mode (default). The data source name is Hive, Spark,
# or Presto when the driver is in legacy mode.
DSI=presto
#APP ID is only for Spark in legacy mode
APPID=""
#Schema name
SCHEMA_NAME=default
# Catalog name
CATALOG=presto
#S3 Bucket region. The driver only supports it in the legacy mode.
BUCKET_REGION=us-east-1
#Bypass QDS Control Plane. The driver only supports it in the legacy mode.
USE_S3=true
#Enable Presto FastStreaming. The driver only supports it in the legacy mode.
STREAM_RESULTS=true
#The QDS BYPASS mode is enabled by default in the QDS Bypass mode. Disable it to use it in the legacy mode for Hive/Spark queries.
QDS_BYPASS=true
#SHOW_ON_UI is enabled by default in the QDS Bypass mode. Disable it to use it in the legacy mode for Hive/Spark queries.
SHOW_ON_UI=true