Changelog for api.qubole.com

Date and time of release Version Change type Change
21st May, 2020 (08:38 AM PST) 58.0.126 Enhancement EAM-2299: Now, the users have an option on the Scheduler page to specify a time interval to run the scheduler. To enable this, the following two new fields are added: Interval: The user needs to enter the numerical value for the time interval corresponding to the file path (as entered by the user in the File Location field). Increment: The user needs to select a time unit for the value specified in the Interval field. Both these are mandatory fields if the user selects Cron expression as the Frequency.
6th May, 2020 (05:25 AM PST) 58.0.113 Bug fix

ACM-6923: This is a bug fix to avoid the duplication of cluster nodes, which remained in the shutting down state in AWS.

ZEP-4446: Notebooks were getting lost when the note.json file was missing. This issue is fixed.

ZEP-4531: The Notebook content was getting lost or reverted after the Websocket reconnect. This issue is fixed.

4th May, 2020 (11:15 PM PST) 58.0.109 Bug fix JUPY-694: Spark session was terminated even when there were active tasks. This issue is fixed.
28th Apr, 2020 (01:18 AM PST) 58.0.106 Bug fix

HADTWO-2545: Qubole has backported YARN-9984 to Hadoop 3. It fixes the issue where ResourceManager got restarted due to NullPointerException in FSPreemptionThread.

HADTWO-2548: The NameNode used to go into the safe mode frequently as Ganglia consumed a lot of disk space in Hadoop 3. To fix this, Qubole has disabled container level metrics by default in Hadoop 3.

QHIVE-5265: The issue with clearing the logging context in HiveServer2, which led to a leak of operation log file descriptors, is fixed. The related open-source Hive JIRA is HIVE-22733.

QHIVE-5318: The issue in a Hive ACID compaction job that caused data loss in the event of job failure is fixed.

QTEZ-242: The issue where Tez autoscaling in a running Tez job did not occur due to lack of resources is fixed. The fix ensures that when the Number Of Pending Tasks is 0, the AutoScaled ContainerRequest is always 0. There is no need to request more containers when tasks are already running or finished.
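The rule described in QTEZ-242 can be pictured with a minimal Python sketch. The function name and the proposed_request parameter are hypothetical and are not Tez or Qubole identifiers; how a non-zero request is computed is not described in this changelog, so that value is simply passed in.

```python
def autoscaled_container_request(pending_tasks: int, proposed_request: int) -> int:
    """Return how many extra containers to request (illustrative only).

    proposed_request stands for whatever the autoscaler would otherwise ask for;
    its computation is outside the scope of this sketch.
    """
    if pending_tasks < 0 or proposed_request < 0:
        raise ValueError("counts must be non-negative")
    # With no pending tasks, every task is already running or finished,
    # so no additional containers are needed.
    if pending_tasks == 0:
        return 0
    return proposed_request


print(autoscaled_container_request(pending_tasks=0, proposed_request=8))
print(autoscaled_container_request(pending_tasks=5, proposed_request=8))
```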

QTEZ-513: When the ApplicationMaster (AM) on which the query is running shuts down abruptly, Tez has an inbuilt mechanism to recover the preexisting state. When this happens, the AM restarts in the Tez DAG recovery mode and the state of the AM that was shut down is recovered in the new AM. Due to an open-source software bug that occurs while scheduling tasks in the Tez DAG recovery mode, the query gets stuck. So, Qubole has disabled the Tez DAG recovery by default.

24th Apr, 2020 (06:03 AM PST) 58.0.105 Bug fix AN-2722: Workbench now displays the complete cluster label as a tooltip when you hover over it.
22nd Apr, 2020 (03:09 AM PST) 58.0.99 Bug fix

PRES-3315: Fixed the parsing error that occurred when queries contained block comments.

PRES-3466: Fixed the issue in the node removal API that did not work in SSL-enabled Presto clusters.

PRES-3558: Fixed the issue that occurred while reading ORC data generated by the minor compaction of Hive ACID tables.

16th Apr, 2020 (04:40 AM PST) 58.0.94 Enhancement ACM-5019: Qubole lets you configure the master instance type in a multi-instance HiveServer2 cluster.
Bug fix ACM-6797: Fixed the bug where a specific cluster settings’ history in the UI was displayed incorrectly.
13th Apr, 2020 (05:29 AM PST) 58.0.91 Bug fix

PRES-3322: It fixes the bug in the Hive strict mode, which incorrectly blocked certain queries on partitioned columns with optimize_metadata_queries enabled on Presto 0.208.

PRES-3362: On Presto version 317 only, a Presto query with the Create table statement failed when the external location was not a directory. This issue is fixed now.

PRES-3442: The issue where Presto queries on ORC tables failed has been resolved. Qubole has now increased the default ORC decompression buffer from 4 MB to 16 MB. It has also introduced a new session property to set decompression buffer to any size.

Session property on Presto 0.208: hive.orc_decompression_max_buffer_size

Session property on Presto 317: hive.orc_max_decompression_buffer_size
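As a minimal sketch of how the two property names listed for PRES-3442 might be used, the Python helper below builds a SET SESSION statement for the given Presto version. The helper name and the '32MB' value are illustrative assumptions; only the property names come from this changelog, and the accepted value format should be checked against the Presto version in use.

```python
# Session property names exactly as listed in the PRES-3442 entry.
ORC_BUFFER_PROPERTY = {
    "0.208": "hive.orc_decompression_max_buffer_size",
    "317": "hive.orc_max_decompression_buffer_size",
}


def orc_buffer_statement(presto_version: str, size: str) -> str:
    """Build a SET SESSION statement for the ORC decompression buffer (hypothetical helper)."""
    if presto_version not in ORC_BUFFER_PROPERTY:
        raise ValueError(f"no known property name for Presto {presto_version}")
    # The size string (for example '32MB') is an assumed data-size format.
    return f"SET SESSION {ORC_BUFFER_PROPERTY[presto_version]} = '{size}'"


print(orc_buffer_statement("0.208", "32MB"))
print(orc_buffer_statement("317", "32MB"))
```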

PRES-3497: Presto queries failed as the user didn’t have write permissions to /media/ephemeral0. This issue is resolved now.

PRES-3510: Fixed a bug in parsing and splitting queries where inline comments had quotes.
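To illustrate the class of problem behind PRES-3510, here is a minimal Python sketch of splitting a multi-statement SQL string on semicolons while skipping over comments and quoted strings. It is not Qubole's parser; the function name is hypothetical and the handling is deliberately simplified.

```python
from typing import List


def split_statements(sql: str) -> List[str]:
    """Split SQL on ';' while ignoring semicolons and quotes inside comments and strings."""
    statements, current = [], []
    i, n = 0, len(sql)
    quote = None  # the quote character currently open, if any
    while i < n:
        ch = sql[i]
        if quote:
            current.append(ch)
            if ch == quote:
                quote = None
            i += 1
        elif sql.startswith("--", i):
            # Inline comment: copy through to end of line without interpreting quotes.
            end = sql.find("\n", i)
            end = n if end == -1 else end
            current.append(sql[i:end])
            i = end
        elif sql.startswith("/*", i):
            # Block comment: copy through to the closing marker.
            end = sql.find("*/", i + 2)
            end = n if end == -1 else end + 2
            current.append(sql[i:end])
            i = end
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
            i += 1
        elif ch == ";":
            statement = "".join(current).strip()
            if statement:
                statements.append(statement)
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


parts = split_statements("SELECT 1; -- don't split; here\nSELECT 'a;b'")
print(parts)
```

A splitter that treats the apostrophe in the inline comment as the start of a string literal would mis-split this input, which is the kind of failure the entry describes.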

PRES-3512: The issue where Presto queries failed while processing empty temp $folder directories is resolved now.

PRES-3513: Fixed the Equi criteria are empty, so INNER join should not have PARTITIONED distribution type error during the planning phase for certain queries involving multiple joins when dynamic filtering is enabled.

PRES-3517: The issue where a Presto query failed with java.lang.NullPointerException due to missing statistics in Hive metastore is resolved now.

PRES-3522: Fixed the SHOW GRANTS query failure on Presto 317.

PRES-3523: Fixed the Failing with isImpersonationEnabled is not supported by Glue error that occurred with Presto 317 and Glue by backporting the fix for a similar error from open source.

PRES-3524: The issue where a Presto query on tables with nested storage directories returned empty results is resolved now.

PRES-3533: Parquet binary statistics generated before PARQUET-251 were corrupted. As a fix, Presto will ignore Parquet binary statistics corrupted before PARQUET-251. This fix is backported into Presto 317.

9th Apr, 2020 (01:58 AM PST) 58.0.88 Enhancement QHIVE-5049: Qubole has removed hive.qubole.dynpart.track.s3 and hive.qubole.dynpart.track.cloudfs, which are S3 eventual consistency configuration properties in Hive 3.1.1 (beta). The eventual consistency is handled through a different mechanism.
3rd Apr, 2020 (12:56 AM PST) 58.0.84 Enhancement QUEST-608: Quest, a Data Engineering product offered by Qubole, is renamed to Qubole Pipelines Service. The Quest UI is now called the Pipelines UI.
31st Mar, 2020 (09:08 AM PST) 58.0.81 Bug fix HADTWO-2144: A Hive-3.1.1 (beta) cluster works only with the s3a filesystem. If a Hive-3.1.1 (beta) cluster does not start up, check with Qubole Support whether the s3a filesystem is enabled on the account.
22nd Mar, 2020 (11:45 PM PST) 58.0.71 Enhancement

PRES-3360: Qubole has added a Datadog alert to detect runaway splits occupying execution slots for more than 10 minutes.

PRES-3404: Qubole has improved utilization of dynamic filters on worker nodes and reduced load on coordinator when dynamic filtering is enabled.

PRES-3429: QDS Presto version 317 is generally available now.

PRES-3469: Qubole has backported OS fixes to improve performance of inequality JOINs that involve BETWEEN and GROUP BY queries.

Bug fix

PRES-2481: Qubole has increased the default value of scheduler.max-requests-queued-per-destination to reduce occurrences of Max requests per destination 1024 exceeded for HttpDestination exceptions under high concurrency.

PRES-3108: Presto queries failed with denied authorization to Hive metastore. To fix this issue, Qubole has added a new configuration property called hive.metastore.thrift.impersonation.enabled. You must set it to true for impersonation support. This helps in cases where authorization is enabled on the Hive metastore and the actual user running a query should be used to send requests to the Hive metastore.

PRES-3403: The issue where a Presto query with the WHERE clause on TEXT files returned an empty result is fixed. Presto can now read from Hive partitions whose location points to a file. This practice is not recommended, though, given its limitations. Instead, you should point the partition location to a directory.

PRES-3411: Fixed UnsupportedOperationException faced with certain multi-JOIN queries with Dynamic Filtering enabled. This is a back-port of an open-source fix.

PRES-3443: SELECT * from system.jdbc.columns Presto queries failed with IndexOutOfBoundsException. To identify the problematic schema, Qubole has handled runtime exceptions and added the schema name in the exception stacktrace. You can fix the problematic schema after referring to logs and successfully execute SELECT * from system.jdbc.columns queries.

19th Mar, 2020 (04:27 AM PST) 58.0.70 Enhancement AN-1430: When the number of columns in the Results pane is greater than 30, only the first 30 columns are rendered. You can use the Column drop-down list to select any (other) 30 columns to view. Additionally, the entire result set is available for download. Beta
Bug fix AN-2510: While pinning custom buckets in Workbench, you can now pick a different region to list bucket contents. Beta
17th Mar, 2020 (03:18 AM PST) 58.0.68 Bug fix JUPY-672: The Spark application status was not displayed in the Jupyter notebook panel menu. This issue is fixed.
12th Mar, 2020 (11:22 AM PST) 58.0.64 Enhancement ZEP-4327: Precaching for Table Explorer (Datasets) is now available in Zeppelin notebooks. Via Support.
Bug fix

ZEP-4336: Paragraphs failed with the NameError: name 'os' is not defined error in Zeppelin 0.8.0. The default python imports (os, json, re, traceback, and getopt) are added in Zeppelin 0.8.0 Pyspark interpreter to fix this issue.

ZEP-4321: Table Explorer was consuming a high amount of memory when loading a large number of tables. As a result, the notebooks page was unresponsive. This issue is fixed.

ZEP-4290: Notebook command failures with the undefined method 'body' for nil:NilClass error have been fixed.

ZEP-4401: The paragraph run time in the Notebooks UI was displayed incorrectly. This issue is fixed.

ZEP-4378: HTML tags were not rendered under markdown paragraph. This issue is fixed.

ZEP-4310: Jobs were failing due to a high logging rate in the data-driven log filtering tool. This issue is fixed by adding an async processing queue for the data-driven log filtering tool.

10th Mar, 2020 (07:56 AM PST) 58.0.61 Enhancement

ACM-6564: Added the m5a, m5ad, r5a and r5ad instance types for all the supported regions.

ACM-6585: Added the c5.12xlarge and c5.24xlarge instance types for all the supported regions.

JUPY-667: In case of isolated mode, the livy session name is unique to facilitate multiple executions of a notebook in parallel.

Bug fix

JUPY-654: Jupyter notebook commands were being marked as failed although the notebooks ran successfully. This issue is fixed.

JUPY-659: The create, update, and rename APIs of Jupyter notebooks were not handling a few special characters in name validation. This issue is fixed.

JUPY-665: Jupyter notebook execution failed when more than 10 notebooks were run from the scheduler in parallel. This issue is fixed.

5th Mar, 2020 (05:56 AM PST) 58.0.59 New feature QUEST-579: Quest is now available as a BETA feature for all the user accounts. Beta
Enhancement

QUEST-350: Users can now see the command logs of the test run in the Events tab. Details such as started, queued, running, cancelled or stopped are displayed. Additionally, an event is also displayed when the connection to Kafka source is not established.

QUEST-502: Users have an option to add the timestamp column in their data frame for Kafka and Kinesis as sources.

QUEST-492: When creating assisted pipelines, users do not have to specify a name of the connection when adding source and sink. The Name your connection field is removed from the Source and Sink section of the Quest UI.

QUEST-500: Pipelines can be in one of the following defined states:

  • DRAFT
  • WAITING
  • ACTIVE
  • PAUSED
  • ERROR
  • ARCHIVED
  • STOPPING

Learn more.

QUEST-501: Users can now delete an operator sequentially starting from the last operator added in Assisted pipeline mode from the Quest UI and by using the REST API.

For more information about the REST API, see Delete Pipeline Operator.

QUEST-512: Users can now edit the pipelines that are in the running state. If the user edits a running pipeline that was created by using the assisted mode, the running pipeline is opened for edit in the BYOC mode. After editing the pipeline, the user must re-deploy the pipeline for the changes to take effect.

Users can discard all the edits or changes made after the pipeline was started by using the Discard option in the UI.

Bug fix QUEST-600: Name of the cloned pipelines is now appended with the siblings count of the parent pipeline. For example, <parent_pipeline_name>_clone_1.
4th Mar, 2020 (01:12 PM PST) 58.0.58 Bug fix

ACM-6397: The issue in which fetching engine configuration caused command failures is resolved now.

ACM-6515: Fixed a bug where a local node bootstrap file was deleted and hence it did not get executed.

ACM-6531: The issue where clusters with version R56 got terminated due to health check failure is resolved now.

QHIVE-5118: You can use the Beeline client to escape control characters in results with a HiveServer2 (HS2) cluster. For more information on the Beeline command shell, see the Apache documentation, HiveServer 2 Clients: Beeline - Command Line Shell.

When this feature is enabled, SELECT queries with the LIMIT clause less than or equal to 999 do not launch a Hadoop job on the cluster. You can get this feature enabled by contacting Qubole Support. Via Support

QHIVE-5194: Fixed the issue which caused an error where query results were displayed in the ASCII format when the Hive table data was in the Parquet format.

25th Feb, 2020 (07:19 AM PST) 58.0.54 Bug fix HADTWO-2367: Qubole has backported HDFS-11499 to fix the issue in HDFS where DFS clients could not close a file as its last block did not have sufficient number of replicas. For more details, see HDFS-11499 and HDFS-11486.
24th Feb, 2020 (06:20 AM PST) 58.0.52 Bug fix

JUPY-621: The Spark progress widget in Jupyter Notebooks displayed a harmless exception sometimes. This issue is fixed.

JUPY-636: Python kernels were not running with the environment attached to the cluster. This issue is fixed.

18th Feb, 2020 (01:07 AM PST) 58.0.49 Bug fix

PRES-3200: Presto was not supporting Hash in Ranger with Presto 0.208. Now it supports Hash in Ranger with Presto 0.208.

PRES-3242: Parquet binary statistics generated before PARQUET-251 were corrupted. To resolve this, Presto ignores Parquet binary statistics generated before PARQUET-251 was fixed.

PRES-3371: The issue where the Presto QueryInfo for a query was not retrievable is resolved now.

PRES-3421: Presto has improved CBO logic to prefer broadcast join in some queries for better performance.

13th Feb, 2020 (07:09 AM PST) 58.0.46 Bug fix ZEP-4330: Clusters start-up time had increased for the accounts with proxy settings as packages were taking a long time to load. This issue is fixed.
11th Feb, 2020 (01:50 AM PST) 58.0.45 Bug fix ZEP-4322: The changes in notebook paragraphs were not saved and the paragraphs were reverting to the previous state. This issue is fixed.
9th Feb, 2020 (07:54 AM PST) 58.0.43 Bug fix ACM-6456: To avoid unavailability of spot nodes or any interruptions in spot nodes’ availability, set required IAM permissions as defined in Additional Permissions. This ensures better fulfillment and resiliency in spot requests. Qubole has restored the fallback to use request-spot-instances in the absence of permissions to call ec2-spot-fleet for heterogeneous spot clusters.
7th Feb, 2020 (12:33 PM PST) 58.0.42 Bug fix HADTWO-2413: Inline credentials were missing from output’s results when a user queried an s3 path (with inline credentials) through s3a filesystem. This issue has been fixed now.
6th Feb, 2020 (12:37 PM PST) 58.0.41 Bug fix PRES-3419: Updating and pushing configuration operations failed for non-Presto clusters that were on R57 and earlier versions due to validation of the Presto version. This issue is resolved as non-Presto clusters do not validate the Presto versions now.
5th Feb, 2020 (5:41 PM PST) 58.0.40 Bug fix PRES-3412: Fixed a bug in the Ranger plugin that occurred while using Ranger user groups with Presto 317 (beta).
3rd Feb, 2020 (11:13 PM PST) 58.0.38 R58 Upgrade (Phase 1 frontend)
30th Jan, 2020 (03:53 AM PST) 57.0.111 Bug fix

AD-3111: Access to the get_creds API is now restricted to only the credentials of the default bucket.

PRES-3249: Fixed UnsupportedOperationException encountered with some complex outer join queries when dynamic filtering is enabled.

PRES-3395: Fixed a bug that caused some users to not see logs for their Presto queries that failed fast (for example, due to syntax errors).

20th Jan, 2020 (06:28 AM PST) 57.0.106 Enhancement AN-2201: Workbench displays columns in the same order as that of the describe <table>. Clicking on Name sorts them in the ascending order. Beta
Bug fix ZEP-4275: Downloading dependencies in notebooks failed because maven stopped support for http. This issue is fixed and now all the references to maven use https.
16th Jan, 2020 (03:17 AM PST) 57.0.103 Enhancement

ACM-5876: Qubole now supports m5n, m5dn, r5n, and r5dn instance types.

ACM-6032: Qubole now supports G4 instance types.

ACM-6165: Qubole now supports r5d.8xlarge, r5d.16xlarge, m5d.8xlarge, and m5d.16xlarge instance types.

Bug fix

QHIVE-5051: Fixed the issue where reading column statistics failed for a column of the date type in a partitioned table. Qubole has backported the open-source fix, HIVE-20098, to all Hive versions.

QTEZ-497: Fixed the issue which caused very large Tez DAG submission to fail with com.google.protobuf.CodedInputStream Exception. Related open-source issue: TEZ-3784.

10th Jan, 2020 (03:46 AM PST) 57.0.100 Enhancement EAM-1801: Bastion Support for the snowflake data store is introduced so that the users can use bastion nodes to whitelist IP and ensure more security (Disabled, Via Support, Cluster Restart Required).
19th Dec, 2019 (02:59 AM PST) 57.0.98 Enhancement

AD-471: While whitelisting an IP address, you can now add a description as required. The description can contain a maximum of 255 characters. It is supported through the UI and API as well. For more information, see Whitelisting IP Addresses and Add a Whitelisted IP Address.

AD-645: Qubole has introduced a new monthly Qubole Compute Unit (QCU) API and it has deprecated the old QCUH and Monthly Usage API. For more information, see View your Qubole Compute Unit (QCU) Usage.

PRES-3234: Presto 317 (beta) now supports AWS Glue metastore.

Bug fix

PRES-3157: Qubole has fixed the MANDATORY_PARTITION_CONSTRAINT rule of Strict Mode in Presto 0.208 to allow queries that use a predicate expression on any partitioned column while scanning a partitioned table.

PRES-3240: The optimization to finish join tasks early if their probe side is empty has been removed, as it deadlocked the query execution if new nodes joined the cluster while the query was executing.

PRES-3282: Presto has added support for Lambdas in ExpressionEquivalence. This is a backport from the following open source commit.

18th Dec, 2019 (11:15 PM PST) 57.0.94 Enhancement

AN-2078: While selecting a cluster for a Hive, Presto, or Spark command, you can view its memory and CPU usage and its Hive metastore connectivity. Reviewing this information helps you make an informed decision on which cluster to choose for submitting commands. Beta

AN-1584: Workbench displays clusters sorted by Up, Pending, Terminating, and Down. Within each set, cluster labels are sorted alphabetically. Beta

AN-2348: Workbench now supports the read-only view for Spark notebook commands. You can use these to view logs, results, and resource links for the selected command. Beta

AN-2275: The command composer can now be resized for Spark and Shell commands. Beta

Bug fix AN-2261: Fixed an issue that sometimes caused the command preview text to not display properly if a keyword search was performed. Beta
2nd Dec, 2019 (12:35 AM PST) 57.0.81 Bug fix ACM-6109: The issue in which the cluster start failed due to the cluster settings’ file upload failure to SSE-KMS enabled buckets has been resolved.
26th Nov, 2019 (11:40 PM PST) 57.0.80 Bug fix ACM-6044: The issue in which using qubole-bash-lib.sh in the node bootstrap script threw /usr/lib/hustler/bin/qubole-bash-lib.sh: QUBOLE_BASH_LIB: unbound variable error has been resolved.
22nd Nov, 2019 (07:44 AM PST) 57.0.79 Bug fix ACM-5991: The issue in which a cluster containing spot nodes in its composition returned add_spot_fleet_using_instance_weight_info_list_sync - Could not make spot fleet request: error has been resolved now.
21st Nov, 2019 (05:04 PM PST) 57.0.77 Enhancement

AD-2805: You can now validate an account using AWS simulation. This helps in identifying any missing permissions.

ZEP-493: Bitbucket is integrated with notebooks. You can use Bitbucket to manage versions of your notebooks. Learn more.

20th Nov, 2019 (02:21 AM PST) 57.0.72 Enhancement

AIR-404: The users can now provide custom on_failure_callback for QuboleCheckOperator just like QuboleOperator. It allows users to define a custom callback if the QuboleCheckOperator fails (Cluster Restart Required).
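A minimal sketch of what a custom failure callback for AIR-404 could look like follows. Airflow passes a context dictionary to on_failure_callback; the callback name, the message format, and the fake context used to exercise it here are illustrative assumptions, and the operator wiring is shown only as a comment because it needs an Airflow installation.

```python
from types import SimpleNamespace


def notify_on_failure(context):
    """Hypothetical callback: summarize which task failed and why."""
    task_instance = context["task_instance"]
    message = (
        f"Task {task_instance.dag_id}.{task_instance.task_id} failed: "
        f"{context.get('exception')}"
    )
    print(message)
    return message


# Wiring inside a DAG file (requires Airflow; other arguments elided):
#   QuboleCheckOperator(..., on_failure_callback=notify_on_failure)

# Exercise the callback outside Airflow with a stand-in context.
fake_context = {
    "task_instance": SimpleNamespace(dag_id="daily_checks", task_id="row_count_check"),
    "exception": ValueError("check returned 0 rows"),
}
summary = notify_on_failure(fake_context)
```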

QUEST-385: Users must create Spark Streaming clusters to run Spark structured streaming pipelines at scale with the Quest features.

QUEST-443: Users can use the new Filter option in the Pipelines view to filter the Pipelines based on their state.

QUEST-435, QUEST-378, QUEST-377, QUEST-403: Users can now edit, clone, archive, and delete pipelines depending on the State of the pipelines.

QUEST-406: Users can now modify the name of the pipeline when the pipeline is in the Draft state.

QUEST-405: Users can add Amazon S3 as a source when creating streaming pipelines.

QUEST-413, QUEST-454: Users can view the health of the streaming pipelines in the left pane of the Pipelines view.

QUEST-334: Users can fetch schemas from the Kafka cluster. For more information, see https://github.com/confluentinc/schema-registry.

New feature PRES-3070: Presto version 317 (beta) is now available on the Presto cluster.
Enhancement

PRES-2990: Improved efficiency of dynamic partition pruning by preventing listing and creation of Hive splits from partitions, which are pruned at runtime.

PRES-3112: Qubole has introduced a feature to enable dynamic partition pruning on Hive tables at account level. This feature is part of Gradual Rollout.

Bug fix

PRES-3051: It fixes the Invalid partition value exception and intermittent ArrayIndexOutOfBoundsException exceptions from queries with dynamic filtering enabled.

PRES-3113: Improved accounting of queued work for calculation of optimal size in autoscaling.

PRES-3131: Qubole has added http-server.log.max-size=1GB in config.properties to ensure that the HTTP request log is rolled over regularly if the file size reaches 1GB.

PRES-3163: It fixes a bug which could cause some additional nodes to be added to the cluster for a short duration during spot loss.

PRES-3187: It fixes a bug that caused the Download as CSV file option on Analyze UI to incorrectly download the file as Tab separated.

14th Nov, 2019 (11:40 AM PST) 57.0.64 Enhancement ACM-5958: Qubole has improved the loading of the cluster UI page to resolve the issue of the page loading slowly.
Bug fix

ZEP-4081: Notebooks were failing due to jar conflicts with Spark. Thrift version used by Zeppelin is upgraded to 0.9.3 to fix this issue.

ZEP-4119: Python environment for the Notebooks was not getting configured properly. This issue is fixed now.

24th Oct, 2019 (4:45 AM PST) 57.0.50 Bug fix ACM-5798: This is a bug fix for warning notifications that had timestamp as timeout value. To resolve this, Qubole provides a configuration to control the warning notifications for a given cluster. To know more, see Configuring Query Runtime Settings.
21st Oct, 2019 (8:42 PM PST) 57.0.48 Bug fix

ACM-5937: This is a bug fix for spot rebalancing that was being triggered incorrectly, which could result in the addition of 10% more nodes than the maximum cluster size configuration.

HADTWO-2185: The issue where jobs were stuck after switching to the s3a filesystem has been resolved.

10th Oct, 2019 (03:46 PM PST) 57.0.40 Bug fix ACM-5866: The issue in which a cluster could not be started due to lack of disk space is resolved now. Qubole has increased the root disk size from 72GB to 90GB for all cluster types. This increases the cost by 1.8 USD per cluster node per month.
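The 1.8 USD figure for the root disk change is consistent with simple arithmetic on the size increase. The sketch below assumes a storage price of 0.10 USD per GB per month; that price is an assumption for illustration and is not stated in this changelog, and actual prices vary by region and volume type.

```python
from decimal import Decimal

OLD_ROOT_DISK_GB = 72
NEW_ROOT_DISK_GB = 90
# Assumed price, not taken from the changelog.
ASSUMED_PRICE_PER_GB_MONTH = Decimal("0.10")


def extra_monthly_cost_per_node(old_gb, new_gb, price_per_gb_month):
    """Additional monthly storage cost per node for a larger root disk."""
    return (new_gb - old_gb) * price_per_gb_month


cost = extra_monthly_cost_per_node(
    OLD_ROOT_DISK_GB, NEW_ROOT_DISK_GB, ASSUMED_PRICE_PER_GB_MONTH
)
print(f"{cost} USD per node per month")
```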
6th Oct, 2019 (11:13 PM PST) 57.0.35 R57 Upgrade (Phase 1 frontend)
30th Sep, 2019 (06:29 AM PST) 56.0.159 Enhancement INFRA-2441: The default Account Level Concurrent Command Limit on the Account Settings tab has increased from 20 to 100.
27th Sep, 2019 (01:13 AM PST) 56.0.158 Enhancement PRES-2791: Qubole has ported open-source changes that are related to improvements in S3 reads to Qubole Presto 0.208 version. For more information, see Faster S3 reads.
Bug fix

PRES-2961: In an IAM-Role-based account, the issue where incorrect IAM-Role info had been sent to Presto has been resolved now.

PRES-2999: The NullPointerException when local memory limits are exceeded and a leak in operator peak memory computations in Presto version 0.208 queries have been resolved now.

PRES-3009: The issue where the Presto coordinator disk was filling up due to presence of RubiX logs in the autoscaling log file has been resolved. rubix.log is excluded from autoscaling logs.

24th Sep, 2019 (08:43 AM PST) 56.0.155 Bug fix ACM-5664: Fixed the caching issue of the cluster list in the Analyze (Workbench (beta)) UI page.
23rd Sep, 2019 (11:43 PM PST) 56.0.154 Bug fix ACM-5616: Status checking of Spot requests has been made more robust by doing retries for DescribeSpotInstanceRequests in case of the RequestResourceCountExceeded error.
Bug fix QHIVE-4798: Fixed an issue that could lead to a memory leak in the HiveServer2 JVM when a large number of concurrent applications were running on a given cluster.
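The retry behavior described for ACM-5616 can be sketched in Python as a generic retry loop with exponential backoff. This is not Qubole's implementation: ApiError is a hypothetical stand-in for a cloud API error, and the attempt count and delays are assumed values. Only the error code name comes from the entry.

```python
import time


class ApiError(Exception):
    """Hypothetical stand-in for a cloud API error that carries an error code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def call_with_retries(call, retryable_codes=("RequestResourceCountExceeded",),
                      max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `call()`, retrying with exponential backoff on retryable error codes."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ApiError as err:
            # Give up on non-retryable errors or once attempts are exhausted.
            if err.code not in retryable_codes or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Demo: a status check that is throttled twice before succeeding.
attempts = {"count": 0}


def flaky_describe():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ApiError("RequestResourceCountExceeded")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky_describe, sleep=delays.append)
print(result, delays)
```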
10th Sep, 2019 (12:59 AM PST) 56.6.11 Bug fix ACM-5689: Fixed an issue where the default Hadoop cluster did not start during the AWS test drive signup.
03rd Sep, 2019 (06:08 AM PST) 56.1 Enhancement

ACM-5266: Qubole supports i3en.large, i3en.xlarge, i3en.2xlarge, i3en.3xlarge, i3en.6xlarge, i3en.12xlarge, and i3en.24xlarge instances.

ACM-5413: Qubole supports m5.8xlarge, m5.16xlarge, m5a.8xlarge, m5a.16xlarge, r5.8xlarge, r5a.8xlarge, r5a.16xlarge, and r5.16xlarge instances.

ACM-5558: During provisioning of nodes on a running heterogeneous cluster, Qubole tries providing instance types set in the heterogeneous configuration before falling back to On-Demand instances. For more information, see Additional Permissions.

Bug fix

ACM-5532: During the scrub run that removes dead nodes from a cluster, some nodes were unreachable. This resulted in performance issues. To resolve this issue, Qubole terminates nodes that it cannot connect to. Via Support.

HADTWO-2098: In some cases, the scrub run did not remove dead nodes from a cluster. This happened because the scrub run also connected to dead nodes through SSH to check on the node bootstrap completion. To resolve this issue, Qubole has improved the scrub run process so that it does not connect to a dead node registered in the ResourceManager. This ensures that dead nodes are removed and no longer remain in the cluster.

22nd Aug, 2019 (10:41 PM PST) 56.0.120 Enhancement JUPY-1: Qubole provides JupyterLab interface, which is the next generation user interface for Jupyter, to create and manage Jupyter notebooks. Jupyter notebooks are supported on Spark 2.2 and later versions. Beta, Via Support. Learn more.
21st Aug, 2019 (11:55 AM PST) 56.5.1 Enhancement

AN-2168: New Analyze is now Workbench. Beta.

AN-1814: You can resize the command query composer in Workbench for Hive, Presto, Quantum, and DB Query commands. Beta.

AN-1324: Cluster live health metrics are now available as part of the Clusters drop-down list in Workbench. Via Support. Beta.

AN-1327: To make debugging easier, Qubole now displays the Cluster Instance ID under the Processing tab of the Status pane. This enables you to collect logs of the particular command by cluster instance. Beta.

AN-2210: You can now tag commands on the History tab in Workbench. You can later use these tags to filter out commands using the Tags field (in the history filter). Beta.

Bug fix AN-2219: Resource links are now clickable in the Logs pane in Workbench. Clicking the link redirects the user to the corresponding cluster dashboard.
09 Aug, 2019 (6:41 AM PST) 56.0.112 Bug fix AN-2240: The cluster selection drop-down list in Workbench now displays Hadoop2 clusters.
06 Aug, 2019 (03:36 AM PST) 56.0.108 Enhancement AIR-390: Now Airflow is supported on New Package Management. New Package Management brings in features like Python 3.7, a new version of Conda, and a lot of fixes on package installation such as support for no-arch packages, and so on (Cluster Restart Required, Disabled, Via Support).
20th Aug, 2019 (4:43 AM PST) 56.0.117 Enhancement ZEP-3373: Users can now edit the notebooks even when the clusters are offline. Via Support
Bug fix ACM-5344: Qubole supports configuring proxy setting of Internet proxy server and no_proxy settings for cluster nodes. To override them, contact Qubole Support. Via Support
06 Aug, 2019 (3:36 AM PST) 56.0.108 Enhancement

ZEP-2717 and ZEP-3602: The Environments UI is now available in the Control Panel by default for the new users.

Limitation: The packages that are installed by default cannot be uninstalled in new version of Package Management.

31st July, 2019 (06:09 AM PST) 56.0.102 Enhancement AD-2476: To mitigate risk while rolling out a new feature, Qubole groups users and accounts into different pods. When a change is rolled out to a pod, Qubole monitors the feature’s performance before rolling it out to subsequent pods. You can view the pod you belong to in the Account Details section of the Account Settings tab.
30 July, 2019 (09:50 AM PST) 56.0.100 Enhancement SCHED-376: SQL Command type has been renamed as Quantum in the Scheduler UI’s command type drop-down list.
29th July, 2019 (2:29 PM PST) 56.0.99 Bug fix

QHIVE-4662: Fixed an issue which caused recursive listing while dropping partition(s) on a managed Hive table. Related OSS Jira: HIVE-22054.

QTEZ-443: Fixed an issue in which Tez UI was unable to download LevelDB file (that contains timeline data/log) for application when the IP address of a cluster node in a private subnet got repeated.

QTEZ-450: Fixed the issue in which Reducer was not visible in Tez UI when the configured Tez version was 0.8.4.

19th July, 2019 (9:37 AM PST) 56.0.92 Enhancement PRES-2918: A new experimental configuration property called experimental.reserved-pool-enabled is added to Presto version 0.208 to allow disabling Reserved Pool, which is used to prevent deadlocks when memory is exhausted in the General Pool by promoting the biggest query to Reserved Pool. However, only one query gets promoted to Reserved Pool and queries in General Pool get into the blocked state whenever it becomes full. To avoid this scenario, you can set experimental.reserved-pool-enabled to false for disabling Reserved Pool. For more information, see Disabling Reserved Pool.
Bug fix

PRES-2746: S3 buckets were inaccessible with IAM Roles configured on the Qubole account. To resolve this issue, Presto has added hive.s3-secondary-role-arn and hive.s3-secondary-role-extid. You can add the ARN and External ID of the user-overridden (secondary) IAM Role that can access S3 buckets, in Hive catalog properties. For more information, see Hive Catalog Properties associated with AWS.

PRES-2797: Fixed the issue in the generated Presto Query Tracker when Presto version was changed on an active Presto cluster.

PRES-2856: Fixed the issue in which command results were displayed without column headers for Presto queries when Qubole drivers executed such queries in Presto FastPath.

PRES-2915: Fixed the issue in which a Presto cluster with idle cluster timeout configuration did not automatically terminate even when it was idle for a longer time.

9th August, 2019 (6:41 AM PST) 56.0.112 Bug fix ACM-5171: Fixed the issue where, on a multi-instance HS2-enabled Hadoop (Hive) cluster, the multi-instance HS2 intermittently failed to start at the first attempt while waiting for the Hadoop (Hive) cluster.
4th July, 2019 (1:25 AM PST) 56.2.3 Enhancement Hive version 2.3 is generally available.
Bug fix

QHIVE-4385: To resolve the FileNotFoundException that occurred while calculating FileSplits for the ORC file format, Qubole has added retries in the Tez AM configuration. You can configure the number of retries by using hive.qubole.handle.s3.stale.listing.retries.split.generation, which defaults to 10. This configuration minimizes query failures due to inconsistency in S3 listing.

This is an enhancement over QHIVE-3675, which handled FileNotFoundException that occurred while processing a specific FileSplit. Via Support
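
The retry behavior described in QHIVE-4385 can be pictured with a small Python sketch. This is not Qubole's Tez AM code: the helper name call_with_retries, the fixed delay, and the reading of the retry count as the number of extra attempts after the first call are all assumptions made for illustration. Only the default of 10 comes from the entry above.

```python
import time


def call_with_retries(operation, retries=10, delay_seconds=0.0):
    """Run operation(), retrying when it raises FileNotFoundError.

    retries mirrors the documented default of 10. Treating it as the number
    of extra attempts after the first call, and the fixed delay between
    attempts, are assumptions made for this sketch.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except FileNotFoundError:
            if attempt >= retries:
                raise  # retries exhausted: surface the original error
            attempt += 1
            time.sleep(delay_seconds)


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_split_generation():
        # Hypothetical stand-in: the first two listings are stale.
        calls["count"] += 1
        if calls["count"] <= 2:
            raise FileNotFoundError("file listed by S3 but not yet readable")
        return ["split-0", "split-1"]

    print(call_with_retries(flaky_split_generation))
```

Re-raising the last error once the retries are used up keeps a genuinely missing file visible instead of masking it.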

20th June, 2019 (4:51 AM PST) 56.2.1 Enhancement ACM-4221 and ACM-5016: Qubole supports Hive 3.1.1 (beta) on a Hive cluster. Starting with cluster API v2.1, Hadoop 2 (Hive) clusters are renamed as Hive clusters. You can set the Hive version to 3.1.1 (beta) while creating or editing a cluster. Qubole supports creating Hive clusters only from cluster API v2.1 onwards. Via Support
13th June, 2019 (11:48 AM PST) 56.0.76 Enhancement

AN-1639: The Status pane on the new Analyze page is now available for Quantum commands.

AN-1708: Permalinks for Hive tables on the new Analyze page now contain account IDs. When you navigate to a Hive table from a different account, a confirmation dialog box appears.

QUEST-321: Users can now use custom code in Python language for creating a streaming pipeline.

QUEST-332: Users can select AVRO as one of the input formats when creating a streaming pipeline in assisted mode with Kafka as source.

QUEST-341: Users can add additional configurations, such as user-defined metadata, for data written to the S3 sink.

SPAR-3514: The Test Run option in the QuEST UI processes a limited number of records and maintains a separate temporary checkpoint location to prevent any corruption of the runtime production checkpoint.

SPAR-3591: Users can pass x-amz-meta-metadata(key1=val1,key2=val2) while creating a new streaming job with S3 as the sink by setting the option fs.s3a.user.metadata to key1=val1,key2=val2. The streaming application creates new files with this metadata. This is supported on Spark 2.3.2 and later versions.
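
As a rough illustration of the value format in SPAR-3591, the following Python sketch parses a key1=val1,key2=val2 string into S3 user-metadata header names, which use the x-amz-meta- prefix. The function parse_user_metadata is hypothetical and does not reflect how the streaming application handles the option internally; it assumes that keys and values contain no commas or equals signs.

```python
def parse_user_metadata(option_value):
    """Turn 'key1=val1,key2=val2' into S3 user-metadata header names.

    Hypothetical helper for illustration only; assumes keys and values
    contain no commas or equals signs.
    """
    headers = {}
    for pair in option_value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {pair!r}")
        headers[f"x-amz-meta-{key.strip()}"] = value.strip()
    return headers


if __name__ == "__main__":
    print(parse_user_metadata("key1=val1,key2=val2"))
```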

TOOLS-1440: The `s3cmd` version is upgraded from 1.5.2 to 2.0.2.

Bug fix

AN-1974: Column headers now appear on the Analyze page for large Hive queries (approximately 65KB and more).

QUEST-340: When the users performed a test run with Kinesis as source and IAM role enabled in the account, the test run failed. This issue is now fixed.

QUEST-324: If the checkpoint location was not unique, the QuEST UI displayed an inaccurate notification when running the streaming pipeline in assisted mode. An accurate notification is now displayed in this case.

10th June, 2019 (8:55 AM PST) 56.0.67 Enhancement

AD-1629: Qubole now supports IAM role-based account creation via API for AWS.

QHIVE-4527: Qubole will deprecate Hive notebooks in the near future.

Bug fix

AD-2554: This bug fix resolves the intermittent "provided token has expired" error for AWS.

AD-2441: The Usage Status Dashboard has been revamped.

QTEZ-440: Fixed the issue in which a Tez DAG hung when exceptions were not caught during a DAG transition.

4th June, 2019 (11:07 PM PST) 56.0.63 Enhancement AN-2100: All file downloads on the Explore page are now handled through AWS Signature Version 4.
3rd June, 2019 (1:39 AM PST) 56.0.61 Enhancement

JDBC-124: Qubole now supports concurrency of multiple statements in Presto FastPath.

PRES-2254: In a Presto notebook, you can now set zeppelin.presto.stacktrace as an interpreter property for displaying stacktrace for certain errors.

PRES-2600: These are the new enhancements in the Presto notebooks:

  • You can now set session properties in Presto notebooks in a paragraph and run it. When set, these session properties are applicable to paragraphs in the notebook’s current session.
  • In Presto notebooks, to improve the debugging experience, the source field is set as notebook_<notebook-name>_<notebook-id>, and in the dashboards, the source field is set as dashboard_<dashboard-name>_<dashboard-id>_<source-note-id>. A source field is directly searchable in the Presto UI. For example, in the Presto UI, you can search for a notebook by its name or ID to quickly filter the queries that are run from that specific notebook while debugging an error.
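
The source field formats above can be sketched in Python as follows. The function names, the sample notebook and dashboard values, and the substring match used to mimic searching are all hypothetical; the Presto UI's actual search behavior is not specified here.

```python
def notebook_source(notebook_name, notebook_id):
    # Follows the documented pattern notebook_<notebook-name>_<notebook-id>.
    return f"notebook_{notebook_name}_{notebook_id}"


def dashboard_source(dashboard_name, dashboard_id, source_note_id):
    # Follows dashboard_<dashboard-name>_<dashboard-id>_<source-note-id>.
    return f"dashboard_{dashboard_name}_{dashboard_id}_{source_note_id}"


def filter_by_source(queries, term):
    """Return queries whose source contains term.

    A plain substring match, assumed here only to illustrate searching
    by notebook name or ID.
    """
    return [q for q in queries if term in q["source"]]


if __name__ == "__main__":
    # Hypothetical sample data.
    queries = [
        {"id": "q1", "source": notebook_source("etl_daily", 1234)},
        {"id": "q2", "source": dashboard_source("kpi", 7, 1234)},
        {"id": "q3", "source": notebook_source("adhoc", 99)},
    ]
    print(filter_by_source(queries, "etl_daily"))
```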
Bug fix

PRES-2515: The issue which caused ArrayIndexOutOfBoundsException for some queries when using Dynamic Filtering is resolved now.

PRES-2727: Fixed the issue where queries failed on tables in which sub-column names of a struct column type started with numbers. Sub-column names of a struct column type can now start with numbers.

PRES-2775: Fixed a bug that could prevent an upscaled Presto cluster from downscaling if short-running queries were regularly scheduled on the cluster.

24th May, 2019 (12:42 PM PST) 56.0.57 Bug fix ZEP-3659: The newly added paragraphs were displayed only after refreshing the page. This issue is fixed.
20th May, 2019 (7:59 AM PST) 56.0.48 Enhancement AN-1382: Click the preview icon to preview a query. You can also dock the query preview while you work on another query.
15th May, 2019 (11:44 PM PST) 56.0.45 New Feature SQOOP-183: When DbImport/DbExport commands running on a pixie cluster fail, the user can see the logs of the failed MR (MapReduce) jobs in the command logs section.
10th May, 2019 (10:14 AM PST) 56.0.42 Enhancement QTEZ-441: Qubole has added an enhancement that, when enabled, adds an application tag containing the Qubole command ID to Tez jobs submitted through Hive-on-master and QDS servers. This tag is used to clean up/kill applications whose corresponding queries are killed. Create a ticket with Qubole Support to enable it on the QDS account.
9th May, 2019 (10:04 AM PST) 56.0.40 Bug fix ACM-5035: This resolves a bug where accounts using different AWS keys (access and secret) for storage and compute credentials were not able to start the cluster due to an Access denied error while copying the node bootstrap. It also resolves cluster start failures for Qubole accounts that are not configured to use the SSE algorithm but whose default S3 location enforces encryption for S3 operations.
8th May, 2019 (5:59 PM PST) 56.0.38 Enhancement PRES-2769: The jdk.nio.maxCachedBufferSize and ExitOnOutOfMemoryError JVM configuration properties are pulled from the open source into the default JVM configuration of Qubole Presto 0.180 and later versions to improve stability. For more information, see jvm.config.
7th May, 2019 (2:28 AM PST) 56.0.31 R56 Upgrade (Phase 1 frontend)