Maintain healthy data pipelines with Metaplane and Airbyte
Metaplane now integrates with Airbyte so you can understand the health of your pipelines over time.
We’ve always said that data teams should be the first to know about data issues. That’s especially true when it comes to data ingestion pipelines, where failures can cascade far downstream.
Today, we're excited to announce our integration with Airbyte. Airbyte, the leading open-source data movement platform, enables more than 6,000 companies to sync data from 300+ structured and unstructured data sources to data warehouses, databases, and more.
More pipelines, more problems
Use cases powered by data, from BI to AI to data products, are increasingly important. As the stakes rise, so does the need to get ahead of common data pipeline problems:
- Pipeline failures. Pipelines inevitably break for different reasons – source system failures, resource contention, and invalid SQL to name a few. These unexpected failures can cause delays and disrupt data flows, breaking downstream data processes and products.
- Latency and performance issues. If a source or destination experiences performance issues and pipelines take longer than usual to finish, there is a ripple effect – every downstream job, transformation, and dashboard ends up working with stale data.
- Data quality issues. Sometimes data pipelines succeed and data is moved, but the volume or underlying quality of the data is compromised. These discrepancies between source and destination systems are quickly noticed by stakeholders using downstream data products.
- Unknown downstream impact. Issues happen. But once you identify an issue, the two questions you ask yourself are: What does this impact? And who needs to know about this? Too often, it’s hard to find the answers to these questions.
Airbyte observability, unlocked
To make their platform scalable, extensible, and reliable across all these use cases, Airbyte has invested heavily in making it observable through rich metadata available via an API.
Through deep integration with the Airbyte API, Metaplane’s Data Observability platform proactively monitors for these common pipeline issues, giving data teams peace of mind about their data.
Metaplane's integration with Airbyte takes minutes to set up. After connecting, Metaplane will populate with all of the Airbyte Workspaces, Connections, and Streams that it has access to.
But that’s just the beginning: Metaplane provides out-of-the-box monitoring of important metrics and automatically syncs lineage so you can understand the health of your pipelines over time.
Monitoring Airbyte metrics with machine learning
Once connected, Metaplane automatically ingests metrics about the syncs that Airbyte runs for each Connection, including:
- Last succeeded sync timestamp to ensure pipelines are up and running
- Duration of syncs to monitor latency issues if syncs take too long to complete
- Number of rows synced and bytes synced to catch unexpected changes in the volume of data loaded
Metaplane maintains a historical log of these metrics so you can look back over time. We then automatically train machine learning models on that history, taking trends, seasonality, and user feedback into account.
The end result is that you’re proactively alerted to anomalies without having to write any code or toil with configuration. Metaplane-Airbyte users can catch pipeline issues like long-running syncs, late-arriving data, and successful syncs that did not move the right amount of data from source to destination.
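To make the idea concrete, here is a minimal sketch of how a historical log of sync durations could be used to flag an unusual run. It is not Metaplane's model: it swaps in a simple modified z-score (median and median absolute deviation), ignores trends, seasonality, and user feedback, and the function name, threshold, and sample durations are all assumptions made for illustration.

```python
from statistics import median


def is_anomalous(history, latest, threshold=3.5):
    """Flag `latest` if it deviates strongly from `history`.

    Uses a modified z-score based on the median and the median
    absolute deviation (MAD). This is an illustrative stand-in,
    not the model Metaplane actually trains.
    """
    if len(history) < 5:
        return False  # too little history to judge (assumed cutoff)
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # Every past value is effectively identical; any change stands out.
        return latest != med
    score = 0.6745 * abs(latest - med) / mad
    return score > threshold


# Hypothetical sync durations in seconds for one Connection.
durations = [310, 295, 305, 300, 298, 312, 301]

print(is_anomalous(durations, 304))  # a typical run
print(is_anomalous(durations, 900))  # a long-running sync
```

The same check could be applied to rows synced or bytes synced. A production model would also need to account for things like weekly seasonality, which is part of what the automatically trained models are meant to handle for you.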
Automatic stream-to-destination lineage
In addition to metrics, you'll instantly see both upstream and downstream warehouse lineage for Streams. If you’re loading data from a database, we’ll include the table that is upstream of a stream. If you’re syncing into a warehouse or database, we’ll create a link between your stream and the table loaded by that stream.
Now you have full end-to-end lineage from source to destination, whether that’s a BI dashboard or a reverse ETL sync. This helps teams build awareness of data – where it comes from, and how it’s used – as well as the potential root causes and downstream impact of data incidents.
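As a rough illustration of why lineage helps answer "what does this impact?", the sketch below treats lineage as a list of directed edges and walks everything downstream of a stream. The edge list, the node names, and the `downstream_of` helper are hypothetical; this is not how Metaplane stores or exposes lineage.

```python
from collections import deque


def downstream_of(edges, start):
    """Return every node reachable from `start`, in breadth-first order."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    seen = {start}
    queue = deque([start])
    impacted = []
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


# Hypothetical lineage: source table -> stream -> warehouse tables -> consumers.
edges = [
    ("postgres.public.orders", "airbyte.stream.orders"),
    ("airbyte.stream.orders", "warehouse.raw.orders"),
    ("warehouse.raw.orders", "warehouse.analytics.revenue"),
    ("warehouse.analytics.revenue", "dashboard.weekly_revenue"),
    ("warehouse.analytics.revenue", "reverse_etl.crm_sync"),
]

for node in downstream_of(edges, "airbyte.stream.orders"):
    print(node)
```

If the `orders` stream has a late or incomplete sync, a traversal like this lists the tables, dashboards, and reverse ETL syncs that may be affected, which is also a reasonable starting point for deciding who needs to know.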
Metaplane and Airbyte are better together
Ready to get started? Sign up for Metaplane for free and start monitoring and pulling in lineage from Airbyte in minutes.
Metaplane supports monitoring both Airbyte cloud and open-source versions. For more information and support, visit our Airbyte docs to learn how to integrate using username/password authentication.