Configuring the ODBC Driver

A Data Source Name (DSN) is a logical name that identifies a database and the set of connection attributes used to connect to the database. Configure the DSN in the ODBC driver.

Prerequisites

These are the prerequisites:

  • Ensure that you have the access to the object/resource that you refer to in the driver configuration. If you see an access denied error, check with the Qubole account administrator. For more information, see Resources, Actions, and What they Mean.

Types of DSN

There are two types of DSNs that you can configure:

  • System DSN
  • User DSN

The /etc/odbcinst.ini file contains information about ODBC drivers available to users and the /etc/odbc.ini file contains information about DSNs available to all users. These System DSNs are useful for applications such as web servers that may not be running as a real user and so there is no home directory to contain an .odbc.ini file. It is a good practice to have DSN defined for each user as it does not affect the system configuration.

Creating a User-level DSN

If you are using ODBC driver for Linux version 2.2.1 (and later versions), then you must create user-level DSNs yourselves as the driver only creates a system-level DSN during its installation.

To create a user DSN, copy odbc.ini, odbcinst.ini, and qubole.odbc.ini configuration files from the \etc\ directory to the home directory (~), change configuration as required and make these files hidden.

Configuring a User-level DSN

To configure a user DSN, set the ODBC environment variables in the user profile. Add an entry into .bash_profile or .profile to initialize the user environment during a bash login.

If you are using iODBC- Insert ODBCINI and ODBCINSTINI, configure the corresponding files as illustrated here.

[ec2-user@aws-instance ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin
ODBCINI=$HOME/.odbc.ini
ODBCINSTINI=$HOME/.odbcinst.ini
export PATH
export ODBCINI
export ODBCINSTINI

If you are using unixODBC- Insert ODBCSYSINI and ODBCINSTINI, configure the corresponding files as illustrated here.

[ec2-user@aws-instance ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
 . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin
ODBCSYSINI=$HOME/.odbc.ini
ODBCINSTINI=$HOME/.odbcinst.ini
export PATH
export ODBCSYSINI
export ODBCINSTINI

Configuring the System DSN

The Qubole ODBC driver comes with a default DSN that you can see in the the odbc.ini file. After you install the driver, you can trace the .odbc.ini file in the /usr/local/qubole/ location. You can add another DSN or modify the existing DSN property values. For the Qubole ODBC driver, these are the defaults in the .odbc.ini file.

Caution

STREAM_RESULTS mentioned below is part of the ODBC Driver beta versions 2.0.0 and 2.1.0. The beta versions can cause data loss in scenarios where results data has NULL characters (0x00) and are therefore not recommended for production usage. For more information, see Launch Stage and Availability of Features.

[ODBC]
# Specify any global ODBC configuration here such as ODBC tracing.

[ODBC Data Sources]
Qubole ODBC Driver DSN=QuboleODBC

[Qubole ODBC Driver DSN]
# This key is not necessary and it is to just describe the data source.
Description=Qubole ODBC Driver DSN

# Driver: The location where the ODBC driver is installed to.
Driver=/usr/local/qubole/libquboleodbc.so
# APITOKEN: The API Token for the account as provided by Qubole.
APITOKEN=""
# CLUSTER_LABEL: The label of the cluster on which queries are to be run.
CLUSTER_LABEL=default
# ENDPOINT: The Qubole endpoint for the account as provided by Qubole.
ENDPOINT=https://api.qubole.com
#DSI: The data source name : Hive, Spark, Presto, or SQLCommand (for Quantum). The cluster label is not mandatory for Quantum.
DSI=hive
APPID=""
# Catalog name
CATALOG_NAME=hive
#S3 Bucket region
BUCKET_REGION=us-east-1
#Bypass QDS Control Plane
USE_S3=true
#Enable Presto FastStreaming
STREAM_RESULTS=true

In the .odbc.ini file, you can edit any DSN property’s value. You can change these DSN properties as per the requirements:

  • APITOKEN: Enter The API token of the QDS account.

    Note

    To find the API token, navigate to Control Panel on the Qubole UI and click the My Accounts tab. Click Show for the account and copy the API token that is displayed. For more information, see Managing Your Accounts.

  • CLUSTER_LABEL: The cluster label is default, which implies that it is default cluster of the account. You can change it by adding the cluster’s label that you want to use.

  • ENDPOINT: It is the QDS environment. By default, it is https://api.qubole.com. To change it to a different QDS environment on which you have the QDS account, add that QDS environment as the endpoint. For more information on the endpoints, see Supported Qubole Endpoints on Different Cloud Providers.

  • DSI: It is the Data Source Interface. By default, it is Hive. You can change it to Presto, SQLCommand (for Quantum) or Spark as required. The cluster label is not mandatory for Quantum.

  • APPID: When you set the Data Source Name as Spark, enter the APP ID. It is not applicable to the Presto or Hive data source.

  • CATALOG_NAME: Add this property and enter the catalog’s name as its value.

  • USE_S3: This property is enabled/set to true by default. It is set to bypass QDS Control Plane for getting results directly from the S3 location. You can enter false later to disable it or remove the keyword.

  • BUCKET_REGION: Add this property as a keyword and enter the AWS region which you use as the cloud object storage for final results (default storage location (defloc)). If you do not specify the AWS region, then it defaults to us-east-1 AWS region.

    Caution

    STREAM_RESULTS mentioned below is part of the ODBC Driver beta versions 2.0.0 and 2.1.0. The beta versions can cause data loss in scenarios where results data has NULL characters (0x00) and are therefore not recommended for production usage. For more information, see Launch Stage and Availability of Features.

  • STREAM_RESULTS: Set it to true to enable Presto FastStreaming. It enables streaming of results directly from AWS S3 in the ODBC driver. This is in contrast to the earlier behavior wherein the driver waits for the query to finish before downloading any results from the QDS Control Plane or from S3. The streaming behavior can help the BI tool performance as results are displayed as soon as they are available in S3. Presto FastStreaming for ODBC driver is supported in Presto versions 0.193 and 0.208. As streaming cannot be used with Presto Smart Query Retry, the Presto FastStreaming feature automatically disables Presto Smart Query Retry.

    Note

    Create a ticket with Qubole Support to enable the Presto FastStreaming feature on the account.

  • VIRTUAL_HOST_STYLE: It is enabled by default. Set it to false to enable path-style URLs that allow a dot in the (.) S3 bucket name of the default storage location (defloc). It is only supported from ODBC Driver version 2.2.1.

    For more information on advanced configuration, see Setting Additional Configuration.

Configuring a Multi-user DSN

Configure a multi-user DSN by editing a .odbc.ini file in the home directory of the current user ($HOME).

Caution

STREAM_RESULTS mentioned below is part of the ODBC Driver beta versions 2.0.0 and 2.1.0. The beta versions can cause data loss in scenarios where results data has NULL characters (0x00) and are therefore not recommended for production usage. For more information, see Launch Stage and Availability of Features.

Create a DSN into the section ODBC Data Sources as mentioned below:

[ODBC Data Sources]
Qubole ODBC Driver DSN=QuboleODBC
Test=QuboleODBC

[Test]
Description=Qubole ODBC Driver DSN
# Driver: The location where the ODBC driver is installed.
Driver=/usr/local/qubole/libquboleodbc.so
# APITOKEN: The API Token for the account as provided by Qubole.
APITOKEN="<API Token>"
# CLUSTER_LABEL: The label of the cluster on which queries are to be run.
CLUSTER_LABEL=presto
# ENDPOINT: The Qubole endpoint for the account as provided by Qubole.
ENDPOINT=https://us.qubole.com
#DSI: The data source name: Hive, Spark, Presto, or SQLCommand (for Quantum). The cluster label is not mandatory for Quantum.
DSI=hive
APPID=""
# Catalog name
CATALOG_NAME=hive
#S3 Bucket region
BUCKET_REGION=us-west-1
#Bypass QDS Control Plane
USE_S3=true
#Enable Presto FastStreaming
STREAM_RESULTS=true
#Enable Path-style URLs
VIRTUAL_HOST_STYLE=false