Announcing our Airflow Integration: Observability into DAGs
Monitor DAGs and Tasks for long runtimes, identify root causes of data incidents using end-to-end lineage, and view the health of your data pipelines in a single pane of glass.
Airflow is the tool of choice for building data pipelines for many data teams. Its ability to easily orchestrate tasks and dependencies helps teams ingest data from source systems, run dbt jobs, and run transformations such as cleaning data, aggregating data, and modeling business logic.
But if you’ve worked with Airflow at scale, you’ve likely run into common challenges, including:
- Latency increases caused by inefficient queries or lack of compute resources
- Difficulty in identifying root causes of data incidents that occur downstream of Airflow jobs
- Lacking one place to see your entire data platform, including lineage from Airflow all the way to your BI tools
Ultimately, Airflow becomes both a source of data quality issues downstream and a hindrance to resolving them.
That’s why we’re excited to announce Metaplane’s Airflow integration, which gives our customers an additional layer of observability into DAGs, Tasks, and lineage. By integrating with Airflow, Metaplane can now monitor DAG and Task duration for unexpectedly long runtimes and extract the lineage of queries run through Airflow.
Find bottlenecks caused by long-running Airflow jobs
Metaplane uses machine learning to automatically monitor and predict how long your Airflow jobs should take based on previous behavior. When jobs take longer than expected to complete, Metaplane will open a data incident and send alerts where your team already lives, like Slack or MS Teams. Once Airflow metadata is sent to Metaplane, setup takes minutes as the platform auto-applies duration monitors to your most important DAGs.
This addresses:
- Long-running DAGs for complex once-a-day or once-a-week transformations such as regularly batched cleaning scripts for large vendor-imported flat files
- Specific tasks that you know have been problematic in the past, such as a SQL operator that regularly fails due to query timeouts
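To make the idea of a duration monitor concrete, here is a minimal sketch of flagging a run that takes much longer than its history suggests. It uses a plain z-score as a stand-in; Metaplane’s actual model is not described in this post, and the function name, threshold, and sample durations are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_unusually_long(history_seconds, latest_seconds, threshold=3.0):
    """Flag a run whose duration sits far above the historical mean.

    Simplified z-score check, NOT Metaplane's actual model.
    """
    if len(history_seconds) < 2:
        return False  # not enough history to estimate the spread
    mu = mean(history_seconds)
    sigma = stdev(history_seconds)
    if sigma == 0:
        # Every past run took the same time; anything longer stands out.
        return latest_seconds > mu
    return (latest_seconds - mu) / sigma > threshold


# Hypothetical durations (in seconds) of a once-a-day DAG.
history = [610, 595, 620, 605, 600, 615]
print(is_unusually_long(history, 612))   # a typical run
print(is_unusually_long(history, 1800))  # roughly three times longer than usual
```

A production monitor would also need to account for trends and seasonality, such as heavier loads at month end, which a fixed z-score threshold does not capture.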
Identify root causes of data incidents using lineage
Metaplane already generates lineage from APIs, metadata, and query parsing to show how objects relate to one another. Integrating Airflow extends that lineage by showing which queries are run as part of your DAGs.
As an Airflow user, this means that you’ll be able to immediately understand which Airflow DAGs or tasks were the root cause behind a missed update to a table or a stale dashboard.
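As a rough illustration of how lineage supports root-cause analysis, the sketch below walks upstream from a stale dashboard and collects the Airflow tasks it depends on, nearest first. The graph, node names, and `airflow_task:` prefix are hypothetical; they are not Metaplane’s data model or API.

```python
from collections import deque

# Hypothetical lineage: each node maps to the nodes directly upstream of it.
UPSTREAM = {
    "dashboard:weekly_revenue": ["table:analytics.revenue"],
    "table:analytics.revenue": ["airflow_task:revenue_dag.build_revenue"],
    "airflow_task:revenue_dag.build_revenue": ["table:raw.orders"],
    "table:raw.orders": ["airflow_task:ingest_dag.load_orders"],
}


def upstream_airflow_tasks(node, upstream=UPSTREAM):
    """Breadth-first walk upstream, returning Airflow tasks nearest first."""
    seen = {node}
    queue = deque([node])
    tasks = []
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent in seen:
                continue  # already visited via another path
            seen.add(parent)
            queue.append(parent)
            if parent.startswith("airflow_task:"):
                tasks.append(parent)
    return tasks


for task in upstream_airflow_tasks("dashboard:weekly_revenue"):
    print(task)
```

Ordering the results nearest first mirrors how an investigation usually proceeds: check the task that writes the dashboard’s source table before looking further upstream.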
Connecting Airflow to Metaplane
After installing the Metaplane Airflow provider, you can establish the connection either through the UI or by creating an environment variable. Once you’ve set the connection properties and configured your callbacks, your duration monitors will begin to receive your task and DAG durations as inputs, and they’ll automatically alert you to any future spikes.
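For readers unfamiliar with Airflow callbacks, the sketch below shows the general shape of one: Airflow calls the function with a context dictionary, and the function reads the task instance to get identifiers and runtime. The function names and payload fields here are hypothetical and are not the Metaplane provider’s API; the provider supplies its own callbacks, so follow its documentation for the real setup. A stand-in context is used so the sketch runs without Airflow installed.

```python
from types import SimpleNamespace


def build_duration_payload(context):
    """Collect identifiers and runtime from an Airflow callback context."""
    ti = context["task_instance"]
    duration = getattr(ti, "duration", None)
    start = getattr(ti, "start_date", None)
    end = getattr(ti, "end_date", None)
    # Assumption: duration may be unset when the callback fires, so fall
    # back to the start and end timestamps when both are available.
    if duration is None and start is not None and end is not None:
        duration = (end - start).total_seconds()
    return {
        "dag_id": ti.dag_id,
        "task_id": ti.task_id,
        "state": ti.state,
        "duration_seconds": duration,
    }


def report_task_duration(context):
    """Hypothetical callback; a real one would forward the payload onward."""
    payload = build_duration_payload(context)
    print(payload)
    return payload


# In a DAG file, callbacks like this are typically attached through default_args:
# default_args = {
#     "on_success_callback": report_task_duration,
#     "on_failure_callback": report_task_duration,
# }

# Stand-in for the context Airflow would pass at runtime.
fake_context = {
    "task_instance": SimpleNamespace(
        dag_id="daily_vendor_import",
        task_id="clean_flat_files",
        state="success",
        duration=742.3,
    )
}
report_task_duration(fake_context)
```

Attaching the callback through `default_args` applies it to every task in the DAG, which avoids repeating it on each operator.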
Get started today if you don’t already have a Metaplane account! Implementation, including monitor configuration and data stack integration, takes no more than 30 minutes. If you have any questions, please don’t hesitate to reach out to our team.