Manual Setup (with Qubole script)

This section describes the manual process for setting up a Qubole account on GCP. The user running this setup needs access to a service account, and to its associated JSON credentials file, with the permissions specified below. Before you begin, make sure that you meet the prerequisites described in Prerequisites and Signup.

Required Permissions for Setup

  • The user executing the manual script must be assigned the following IAM roles:

    • Service Account Key Admin: To generate JSON keys for service accounts.
    • Storage Admin: To create and assign read/write privileges on storage buckets.
  • The service account used to execute the manual script must have the following IAM roles:

    1. Service Account Admin: This role includes permissions for working with service accounts.
    2. Project IAM Admin: This role contains permissions to access and administer a project’s IAM policies.
    3. Role Administrator: This role includes permissions to create custom roles.
  • Create a new bucket with bucket-level permissions enabled for the default storage location (defloc). From the Permissions tab on the bucket details page, assign either the Storage Legacy Bucket Owner or the Storage Admin role on the defloc bucket to the service account used to execute the setup:

    ../../../_images/39-give-bucket-role-to-sa.png

Note

At least one of the two roles Storage Legacy Bucket Owner and Storage Admin should appear in the IAM section of your GCP console. Add either of these roles to provide the permissions required for the default storage location.
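
The role assignments above can also be made from the command line. The following is a minimal sketch, not part of the Qubole setup script. It assumes that the role names map to the predefined role IDs shown in the comments, and that "bucket-level permissions enabled" refers to uniform bucket-level access. The function names are hypothetical helpers, and the project ID, bucket name, and service account email are placeholders you must supply.

```shell
# Sketch: grant the setup roles to the service account that runs the setup script.
# grant_setup_roles and create_defloc_bucket are hypothetical helper names.

grant_setup_roles() {
  local project_id="$1" setup_sa="$2" role
  # Role IDs assumed to correspond to Service Account Admin,
  # Project IAM Admin, and Role Administrator, in that order.
  for role in \
    roles/iam.serviceAccountAdmin \
    roles/resourcemanager.projectIamAdmin \
    roles/iam.roleAdmin
  do
    gcloud projects add-iam-policy-binding "$project_id" \
      --member="serviceAccount:${setup_sa}" \
      --role="$role"
  done
}

create_defloc_bucket() {
  local bucket="$1" setup_sa="$2"
  # -b on enables uniform bucket-level access (an assumption about what
  # "bucket-level permissions enabled" means in the text above).
  gsutil mb -b on "gs://${bucket}"
  # Storage Admin on the bucket; Storage Legacy Bucket Owner is the alternative.
  gsutil iam ch "serviceAccount:${setup_sa}:roles/storage.admin" "gs://${bucket}"
}
```

For example, you might call grant_setup_roles with your project ID and the service account email, and then create_defloc_bucket with the bucket name (without gs://) and the same email.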

Setup Process

  1. In the QDS UI, go to Control Panel > Account Settings > Access Settings. For Access Mode Type, select Manual.

    ../../../_images/03b-CreateQuboleServiceAccount.png
  2. Click Download account setup script to download the setup_service_accounts.sh script for the setup process.

  3. Leave the fields Compute Service Account and Instance Service Account blank. The service accounts will be created for you when you execute the setup script (see step 7), and these fields will be populated automatically.

  4. Enter your Project ID and the Default Location (defloc) for storing Qubole data and logs.

Note

Omit gs:// when specifying your defloc value here.

  5. You must provide the JSON credentials file corresponding to the service account you are using for this setup process as an input to the script. For information on creating a JSON credentials file, see Creating and managing service account keys in the GCP documentation.

  6. Upload your credentials file and the downloaded setup script to Cloud Shell. For information on uploading files to Cloud Shell, see Using the Session Window in the GCP documentation.

  7. Invoke the downloaded setup script using the source command. The script uses the gcloud and gsutil command-line tools, which are part of the Google Cloud SDK. Execute the script from a Google Cloud Shell under your GCP account, because all required Google Cloud SDK packages come pre-installed with Cloud Shell. For information on how to invoke Cloud Shell, see Starting Cloud Shell in the GCP documentation.

    Enter these values specific to your GCP account into the setup script:

    • Your Qubole Service Account.
    • Your JSON credentials file.
    • Your project ID.
    • A default location (defloc) on Google Cloud Storage that Qubole can use to store logs and processing output. This is a URL of the form gs://<path-to-bucket>.

    Usage:

    source setup_service_accounts.sh \
    --qubole_sa=<qubole_service_account> \
    --credentials_file=<customer_json_credentials_file> \
    --project=<customer_ProjectID> \
    --defloc=<google_storage_bucket_for_qubole>
    

    Example:

    source setup_service_accounts.sh \
    --qubole_sa=[email protected] \
    --credentials_file=gcp-key.json \
    --project=qubole-gce \
    --defloc=gs://vs-test
    

In this example, gcp-key.json is the credentials file uploaded to the cloud shell.
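
Before sourcing the script, it can help to check the inputs. The sketch below is an illustration only and is not part of the Qubole script. The helper names check_setup_inputs and defloc_for_ui are hypothetical. It checks that gcloud and gsutil are available, that the credentials file is readable, and that the defloc has the gs:// form the script expects. It also shows how to derive the value for the Access Settings field, where gs:// is omitted.

```shell
# Sketch: validate inputs before running "source setup_service_accounts.sh ...".
# check_setup_inputs and defloc_for_ui are hypothetical helper names.

check_setup_inputs() {
  local credentials_file="$1" defloc="$2" tool
  # The setup script relies on both tools from the Google Cloud SDK.
  for tool in gcloud gsutil; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      return 1
    fi
  done
  if [ ! -r "$credentials_file" ]; then
    echo "credentials file not readable: $credentials_file" >&2
    return 1
  fi
  # The script expects a URL of the form gs://<path-to-bucket>.
  case "$defloc" in
    gs://?*) ;;
    *) echo "defloc must start with gs://: $defloc" >&2; return 1 ;;
  esac
  return 0
}

# The Access Settings field takes the same location without the gs:// prefix.
defloc_for_ui() {
  printf '%s\n' "${1#gs://}"
}
```

With the example values above, check_setup_inputs gcp-key.json gs://vs-test would succeed in Cloud Shell once the key file is uploaded, and defloc_for_ui gs://vs-test prints the value to enter in the Default Location field.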

The output of the setup script will look similar to the following. You can use this output to complete the Access Settings fields in the QDS Control Panel:

Compute Service Account : [email protected]
Instance Service Account: [email protected]
Project ID              : qubole-gce
Default Location        : vs-test
Data Buckets            : vs-data-buckets,arkaraj-acm

Note

You can also display the names of the two service accounts with the echo command in Cloud Shell. They are available in the following environment variables:

  • Compute Service Account: $COMPUTE_SERVICE_ACCOUNT_FOR_QUBOLE
  • Instance Service Account: $INSTANCE_SERVICE_ACCOUNT_FOR_QUBOLE

Qubole recommends that you also store the service account names in a secure place.
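
As a sketch of the note above, the following helper writes both names to a file that only your user can read. The function name save_qubole_service_accounts and the output path are hypothetical. It assumes the setup script was sourced in the same shell, so that the two environment variables are set.

```shell
# Sketch: save the service account names from the environment variables.
# save_qubole_service_accounts is a hypothetical helper name.

save_qubole_service_accounts() {
  local out_file="$1"
  if [ -z "${COMPUTE_SERVICE_ACCOUNT_FOR_QUBOLE:-}" ] ||
     [ -z "${INSTANCE_SERVICE_ACCOUNT_FOR_QUBOLE:-}" ]; then
    echo "variables not set; source the setup script in this shell first" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the session.
  (
    umask 077
    printf 'Compute Service Account : %s\nInstance Service Account: %s\n' \
      "$COMPUTE_SERVICE_ACCOUNT_FOR_QUBOLE" \
      "$INSTANCE_SERVICE_ACCOUNT_FOR_QUBOLE" > "$out_file"
  )
}
```

To print a single value instead, run echo "$COMPUTE_SERVICE_ACCOUNT_FOR_QUBOLE" in the same Cloud Shell session.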

  8. To use Google BigQuery, you must add two additional roles:

    • roles/bigquery.dataViewer on your Compute Service Account (CSA).
    • roles/bigquery.readSessionUser on your Instance Service Account (ISA).
  9. Click Save to finish the setup of your account.

  10. Validation of credentials after Save:

    • If your settings are saved successfully, you will see a message at the top of the page saying, “Please wait while we validate your settings. This may take a few minutes.” Upon completion of the validation, your account will be fully operational.
    • Qubole validates your settings in the background, so you can use the application while the settings are being validated, but you will not be allowed to update the access settings or perform any GCP operations, such as starting a cluster.
    • Validation may take up to 5 minutes.
    • If validation is successful, you will see a green check mark in the Access Settings section next to the Default Location field. If validation fails, you will see a red X in the Access Settings section next to the Default Location field.
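
The two BigQuery roles in the step above can be granted with gcloud. This is a minimal sketch that assumes the roles are granted at the project level, which the text does not state explicitly. The function name grant_bigquery_roles is hypothetical, and the project ID and service account names are the values from your own setup output.

```shell
# Sketch: add the BigQuery roles to the two Qubole service accounts.
# grant_bigquery_roles is a hypothetical helper name.
# Assumption: the roles are bound at the project level.

grant_bigquery_roles() {
  local project_id="$1" csa="$2" isa="$3"
  # Compute Service Account (CSA): read access to BigQuery data.
  gcloud projects add-iam-policy-binding "$project_id" \
    --member="serviceAccount:${csa}" \
    --role="roles/bigquery.dataViewer"
  # Instance Service Account (ISA): permission to create read sessions.
  gcloud projects add-iam-policy-binding "$project_id" \
    --member="serviceAccount:${isa}" \
    --role="roles/bigquery.readSessionUser"
}
```

If the setup script was sourced in the current shell, the last two arguments can be "$COMPUTE_SERVICE_ACCOUNT_FOR_QUBOLE" and "$INSTANCE_SERVICE_ACCOUNT_FOR_QUBOLE".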

Adding Cloud Storage Buckets and Configuring Permissions

Cloud Storage buckets used with QDS must be configured to provide read/write access to your Compute Service Account and Instance Service Account. To add Cloud Storage buckets and configure access permissions, perform the following steps:

  1. In the Navigation menu of the GCP console, click Storage and then click Create Bucket.

    ../../../_images/51-storage-create-bucket.png
  2. Provide a name for the bucket and click Create.

  3. On the Bucket details screen, click the Permissions tab.

  4. Click Add members.

    ../../../_images/53-storage-bucket-add-members.png
  5. In the QDS UI, from the Access Settings section of the Control Panel, copy the names of your Compute Service Account and Instance Service Account.

  6. Paste the service account names into the New members field in the GCP console Add members screen.

  7. In the Role field, click the role selection dropdown list and select Custom > Custom Qubole Storage Role to assign the Qubole Storage Role to the new members.

    ../../../_images/52-storage-bucket-add-role.png
  8. Click Save.
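
The console steps above have a command-line equivalent. The sketch below assumes that the Custom Qubole Storage Role shown in the console corresponds to the custom role ID qbol_storage_role in your project, and that your Cloud SDK version includes the gcloud storage commands. The function name grant_qubole_storage_role is hypothetical. Verify the role ID in the IAM section of the console before relying on it.

```shell
# Sketch: grant the Qubole storage role on a bucket to one or more service accounts.
# grant_qubole_storage_role is a hypothetical helper name.
# Assumption: the custom role ID in the project is qbol_storage_role.

grant_qubole_storage_role() {
  local project_id="$1" bucket="$2" member
  shift 2
  # Remaining arguments are service account emails (CSA and ISA).
  for member in "$@"; do
    gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
      --member="serviceAccount:${member}" \
      --role="projects/${project_id}/roles/qbol_storage_role"
  done
}
```

A call would pass the project ID, the bucket name without gs://, and then the Compute Service Account and Instance Service Account names copied from Access Settings.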

Points to remember

  1. In manual setup (as with automated setup), Qubole creates two custom roles in your project: qbol_compute_role and qbol_storage_role. Do not modify or delete these roles, because doing so might lead to unexpected behavior.
  2. Every time access settings are saved in the automated process, you must ensure that the Qubole Service Account (QSA) has the required permissions mentioned above.
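
To confirm that the two custom roles from the first point are still present, a read-only check such as the following can be used. It is a sketch only. The function name check_qubole_custom_roles is hypothetical, and it assumes the role IDs match the names given above.

```shell
# Sketch: verify that the Qubole custom roles exist in the project.
# check_qubole_custom_roles is a hypothetical helper name.

check_qubole_custom_roles() {
  local project_id="$1" role missing=0
  for role in qbol_compute_role qbol_storage_role; do
    # describe is read-only; it fails if the role does not exist.
    if gcloud iam roles describe "$role" --project="$project_id" >/dev/null 2>&1; then
      echo "found: $role"
    else
      echo "missing: $role"
      missing=1
    fi
  done
  return "$missing"
}
```

The function returns a non-zero status if either role is missing, so it can be used in a conditional before starting clusters.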