4. How does Qubole access data in my Cloud object store?

Qubole accesses data using the storage credentials added to the account’s configuration. Which machines read the data depends on the operation:

  • For Hive queries, Pig scripts, Hadoop jobs, and Presto queries, Qubole runs a Hadoop cluster on machines rented by your account. The Hadoop cluster reads the data, processes it, and writes the results back to your buckets. All the data is accessed on your machines.
  • When you browse or download results through Qubole’s website (the UI or the API), machines owned by Qubole read the results from your object store and provide them to you.
  • For Data Import or Export commands, the data is transferred by a machine that runs within Qubole’s account. If you use Qubole’s cluster, data in your buckets is accessed by Qubole’s machines. This is optional, however: you can choose to use your own cluster instead.
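
The three cases above can be summarized as a small lookup table. The following is only an illustrative sketch: the operation names, the `Accessor` enum, and the `who_accesses` function are hypothetical labels invented for this example and are not part of any Qubole API. It also assumes that choosing your own cluster for Data Import or Export means the data is then accessed on your machines, which is one reading of the last bullet.

```python
from enum import Enum


class Accessor(Enum):
    """Whose machines read the data (illustrative labels, not Qubole terms)."""
    YOUR_CLUSTER = "cluster on machines rented by your account"
    QUBOLE_MACHINES = "machines owned by or running within Qubole's account"


# Default access path per operation, following the three bullets above.
# The keys are hypothetical names chosen for this sketch.
DEFAULT_ACCESS = {
    "hive_query": Accessor.YOUR_CLUSTER,
    "pig_script": Accessor.YOUR_CLUSTER,
    "hadoop_job": Accessor.YOUR_CLUSTER,
    "presto_query": Accessor.YOUR_CLUSTER,
    "browse_or_download_results": Accessor.QUBOLE_MACHINES,
    "data_import_export": Accessor.QUBOLE_MACHINES,
}


def who_accesses(operation: str, use_own_cluster: bool = False) -> Accessor:
    """Return which machines touch the data for a given operation.

    `use_own_cluster` only matters for import/export, where using
    Qubole's cluster is optional (assumption: opting out moves the
    access onto your own cluster).
    """
    if operation not in DEFAULT_ACCESS:
        raise ValueError(f"unknown operation: {operation}")
    if operation == "data_import_export" and use_own_cluster:
        return Accessor.YOUR_CLUSTER
    return DEFAULT_ACCESS[operation]
```

For example, `who_accesses("presto_query")` yields `Accessor.YOUR_CLUSTER`, while `who_accesses("data_import_export")` yields `Accessor.QUBOLE_MACHINES` unless `use_own_cluster=True` is passed.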