Monitoring an Airflow Cluster¶
You can monitor an Airflow cluster by using the Airflow Web Server and Celery Web Server. The web server URLs are available in the Resources column of a running Airflow cluster.
Qubole supports Monit within an Airflow cluster to monitor the Webserver, RabbitMQ, and Celery services and to restart them automatically if they fail.
Monitoring through Airflow Web Server¶
In the Clusters tab, from the cluster Resources of a running Airflow cluster, click Airflow Web Server. The Airflow Web Server is displayed as shown in the following illustration.
Click the Qubole Operator DAG as shown in the following figure.
Click a command in the chart to see the Goto QDS link, as shown in the following figure.
Qubole Operator tasks are linked with a QDS command. The following features are available to facilitate linking:
- Goto QDS: An external link, shown while visualizing the Qubole Operator tasks of a DAG run in the web server, that points to the corresponding QDS command.
- Filtering Airflow QDS Commands: Any QDS command triggered through the Airflow cluster contains three tags: dag_id, task_id, and run_id. These can be used to filter QDS commands triggered from Airflow at various levels (DAG, task, or a particular execution).
Monitoring through the Celery Dashboard¶
In the Clusters tab, from the cluster Resources of a running Airflow cluster, click Celery Dashboard to monitor the Celery workers. The Celery server runs on port 5555.
Monitoring through Ganglia Metrics¶
When Ganglia Metrics is enabled, you can see the Ganglia Metrics URL from the cluster Resources of a running Airflow cluster. The dashboard shows system metrics such as CPU and disk usage.
Monitoring through Logs¶
You can monitor an Airflow cluster using the following types of logs:
- Airflow logs: Airflow DAG logs are stored in /media/ephemeral0/logs/airflow/dags, and a symlink at the old location, $AIRFLOW_HOME/logs, points to that directory. As a result, the logs do not consume local disk space.
- Airflow services logs: Logs for services such as scheduler, webserver, Celery, and so on are under /media/ephemeral0/logs/airflow.
- Airflow logs (remote): All Airflow logs are uploaded to the remote storage location provided in the account. These logs can be found
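When checking the on-cluster logs by hand, it can help to see which files were written most recently. The following is a minimal sketch, not a Qubole-provided tool: the helper name `newest_logs` is hypothetical, and it only assumes the log directories named above exist on the node where it runs.

```python
import os


def newest_logs(log_root, limit=5):
    """Return up to `limit` (path, mtime) pairs under log_root, newest first.

    Hypothetical helper for illustration; log_root would typically be
    /media/ephemeral0/logs/airflow on an Airflow cluster node.
    """
    entries = []
    # Walk the whole tree so per-DAG and per-service subdirectories are included.
    for dirpath, _dirnames, filenames in os.walk(log_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            entries.append((path, os.path.getmtime(path)))
    # Sort by modification time, most recent first.
    entries.sort(key=lambda entry: entry[1], reverse=True)
    return entries[:limit]
```

For example, `newest_logs("/media/ephemeral0/logs/airflow", limit=10)` would list the ten most recently modified service and DAG log files on that node.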
Monitoring through Prometheus¶
You can monitor an Airflow cluster through Prometheus- and Grafana-based monitoring, which comes pre-installed with the cluster. To enable it for your clusters, contact Qubole Support. For more information about Grafana, see https://grafana.com/grafana.
Airflow clusters also expose an endpoint that any Prometheus instance can scrape for metrics. To access the endpoint, send a request to
https://<env>.qubole.com/airflow-webserver-<cluster-id>/admin/metrics/ with the Qubole Auth Token
added to the headers. For more information, see Using Custom Headers with Prometheus.
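As an illustration of calling the metrics endpoint directly, the following minimal sketch builds the request with Python's standard library. The function names are hypothetical, and the header name `X-AUTH-TOKEN` is an assumption based on the header the Qubole REST API uses for authentication; confirm the expected header for your account before relying on it.

```python
import urllib.request


def build_metrics_request(env, cluster_id, auth_token):
    """Build (but do not send) a request for the Airflow metrics endpoint.

    env and cluster_id fill the <env> and <cluster-id> placeholders in the URL.
    The X-AUTH-TOKEN header name is an assumption, not confirmed by this page.
    """
    url = (
        f"https://{env}.qubole.com/"
        f"airflow-webserver-{cluster_id}/admin/metrics/"
    )
    return urllib.request.Request(url, headers={"X-AUTH-TOKEN": auth_token})


def fetch_metrics(env, cluster_id, auth_token, timeout=10):
    """Send the request and return the response body as text."""
    request = build_metrics_request(env, cluster_id, auth_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Separating the request construction from the network call makes it possible to inspect the URL and headers before sending anything. A Prometheus instance would be configured to send the same header when scraping.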