Airflow

New Features

  • AIR-278: Prometheus is now integrated with Airflow for real-time monitoring of running DAGs, with dashboards in Grafana (Cluster Restart Required). This integration helps you to:

    • Monitor the number of parallel tasks
    • Monitor individual tasks and their performance
    • Monitor memory consumption
    • Monitor errors and failures
    • Configure alerts
  • AIR-279: You can also collect metrics with your own Prometheus setup by configuring it to request https://<env>.qubole.com/airflow-webserver-<cluster_id>/admin/metrics/ with the Auth Token. Cluster Restart Required

  • AIR-263: Changes made in the DAG Explorer are now reflected on the Airflow cluster as soon as they are saved. Cluster Restart Required

  • AIR-77: Qubole Airflow clusters now support the Airflow REST API.

  • AIR-202: Apache Airflow v1.10.0 is now supported on QDS. When creating an Airflow cluster, you can select this version from the Airflow Version drop-down on the cluster UI. To use Airflow v1.10.0, you must create a new cluster. Apache Airflow 1.10.0 adds features such as timezone support, performance optimization for large DAGs, and the Kubernetes Operator and Executor. For the complete list of changes, see the Changelog. Apache Airflow 1.10.0 also provides a web interface with Role-Based Access Control (RBAC), but that interface is not yet supported in QDS. If you use MySQL or MariaDB as the database backend for your Airflow cluster, timezone support is not available due to limitations in these database systems.
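
For AIR-279, a request to the metrics endpoint might look like the following minimal Python sketch. It is useful mainly for checking that the endpoint responds before pointing Prometheus at it. The URL pattern comes from the note above. The header name X-AUTH-TOKEN and the helper names are assumptions made for illustration; confirm the header your environment expects.

```python
import urllib.request


def metrics_url(env: str, cluster_id: str) -> str:
    """Build the metrics URL from the pattern given in the AIR-279 note."""
    return f"https://{env}.qubole.com/airflow-webserver-{cluster_id}/admin/metrics/"


def build_metrics_request(env: str, cluster_id: str, auth_token: str,
                          header_name: str = "X-AUTH-TOKEN") -> urllib.request.Request:
    """Prepare (but do not send) an authenticated request for the metrics.

    header_name is an assumption: the release note only says the request
    must carry the Auth Token, not which header holds it.
    """
    return urllib.request.Request(
        metrics_url(env, cluster_id),
        headers={header_name: auth_token},
    )


def fetch_metrics(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example usage (placeholder values, performs a network call):
#   req = build_metrics_request("<env>", "<cluster_id>", "<auth token>")
#   print(fetch_metrics(req))
```

In a Prometheus scrape configuration, the same URL path and token would be supplied through the scrape job settings rather than through a script like this one.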

Enhancements

Bug Fixes

  • AIR-277: The default cluster data store for Airflow clusters has changed from MySQL to PostgreSQL. Cluster Restart Required
  • AIR-266: The PostgreSQL Python driver is now pre-installed on Airflow clusters, which simplifies configuring a PostgreSQL data store for the cluster. Cluster Restart Required
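
As an illustration of what a PostgreSQL data store configuration involves, the sketch below builds a SQLAlchemy-style connection URI of the kind Airflow accepts for its metadata database. It assumes the pre-installed driver is psycopg2, which the note above does not state. The function name and the sample values are hypothetical, and how QDS accepts the data store details is not covered here.

```python
from urllib.parse import quote_plus


def postgres_conn_uri(user: str, password: str, host: str,
                      database: str, port: int = 5432) -> str:
    """Build a SQLAlchemy connection URI for PostgreSQL.

    Assumption: the driver is psycopg2, hence the "postgresql+psycopg2"
    scheme. The user and password are percent-encoded so that characters
    such as "@" or "/" do not break the URI.
    """
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Example usage with placeholder values:
#   postgres_conn_uri("airflow", "p@ss/word", "db.example.com", "airflow")
```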