New feature: fine-tuned regression testing for proactive incident prevention
Learn more about configuration options for dbt CI checks.
If you use dbt, you’re probably used to data analysts, business stakeholders, and even other data engineers building on all of your hard work. A new analytics initiative means new dashboards and reports. New business needs mean shiny new models derived from yours.
As a result, updating your dbt models can become daunting, because any change might unintentionally break a downstream dependency. To help your teams merge those changes with confidence, we built a native GitHub application that automates CI checks alongside your pull requests. The checks include:
- Impact analyses: an outline of the downstream tables, BI dashboards, workbooks, and other objects that may be impacted by your change
- Test previews: a report that shows how the data in your tables, dashboards, reports, and other dbt models will be affected by the change
Configure regression testing
One challenge with automated CI checks, especially for large codebases, is the execution time required to run them. For Metaplane CI tests in particular, regression tests are run against your warehouse. The more data you have, the longer it takes for Metaplane to query it and let you know about any possible downstream impacts that could trigger a data quality incident.
To strike the right balance between speed and power, we’ve recently refined impact analyses and test previews so that you can test a more targeted set of objects and cut down on potential noise. The new configuration options include:
- Filter by Metaplane tags. Specify exactly which objects Metaplane should regression test, such as your most critical “p0” tables or tables upstream of important dashboards.
- Turn off checks. For example, if you’re aware that you’ll be making breaking changes to downstream tables that you’re no longer using, you may not need Metaplane’s CI checks to run.
- Ignore draft PRs. You may not want to run tests on PRs that aren’t ready for review.
- Specify where these tests should run. If you have a lot of data you’d like to test, you may decide to run the checks on a bigger warehouse.
- And more, including fine-tuned timeout options and the ability to specify the percentage threshold that constitutes a failure for the regression tests.
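To make the options above concrete, here is a minimal Python sketch of how tag filtering, draft-PR skipping, and a percentage failure threshold could fit together. It is purely illustrative: `CheckConfig`, `should_run`, `select_tables`, and `is_regression` are hypothetical names invented for this example, not Metaplane's actual configuration format or API, and the relative-change rule is an assumption about how such a threshold might be applied.

```python
from dataclasses import dataclass


@dataclass
class CheckConfig:
    """Hypothetical settings that mirror the options described above."""
    enabled: bool = True                      # "Turn off checks"
    skip_draft_prs: bool = True               # "Ignore draft PRs"
    required_tags: frozenset = frozenset()    # "Filter by Metaplane tags"
    failure_threshold_pct: float = 5.0        # "% threshold" for a failure


def should_run(config: CheckConfig, is_draft: bool) -> bool:
    """Decide whether checks run at all for a given pull request."""
    if not config.enabled:
        return False
    if is_draft and config.skip_draft_prs:
        return False
    return True


def select_tables(config: CheckConfig, table_tags: dict) -> list:
    """Keep only tables carrying at least one required tag.

    With no required tags configured, every table is selected.
    """
    if not config.required_tags:
        return sorted(table_tags)
    return sorted(
        name for name, tags in table_tags.items()
        if tags & config.required_tags
    )


def is_regression(config: CheckConfig, before: float, after: float) -> bool:
    """Flag a metric whose relative change exceeds the threshold (assumed rule)."""
    if before == 0:
        return after != 0
    change_pct = abs(after - before) / abs(before) * 100
    return change_pct > config.failure_threshold_pct
```

For example, a configuration with `required_tags=frozenset({"p0"})` would restrict testing to tables tagged “p0”, and raising `failure_threshold_pct` would tolerate larger shifts in a metric such as row count before a check is marked as failed.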
How to get started with Metaplane for data regression testing
If you already have a Metaplane account, first connect your dbt instance and the GitHub repo that hosts your dbt code. Then navigate to your dbt connection and click “Edit GitHub integration” in the top-right corner.
Otherwise, to get started with Metaplane, you can create an account or pick a time to learn more about data observability best practices from the team. New users can implement data quality monitoring, including these regression tests, within the hour!