A guide to dbt test types
Understanding the different types of dbt tests can be confusing. We break it all down in this post: the different types of dbt tests, what they do, and when and how to use them.
Since breaking onto the scene, dbt has helped data teams take the guesswork out of their data pipelines, giving them the tools to confidently answer the big question: “Can I trust this?”
dbt testing can be a bit confusing to get started with, though. You’ve got generic tests, custom tests, unit tests, and sub-sections of each of those. And what’s the deal with pre-built packages? How do you decide which ones to use and where they fit in your workflow?
This guide breaks it all down. We’ll walk through the different types of dbt tests, where they shine, and how to make them work for you.
Let’s dive in.
Types of dbt tests
The hierarchy of dbt tests can be confusing because there are two parent categories, plus sub-categories within one of them. We’ll clear things up here.
Testing in dbt generally falls into two broad categories:
Data tests: Focus on ensuring the accuracy, consistency, and completeness of your data. Think of these as guardrails that help you enforce data quality expectations.
Unit tests: Focus on validating the logic of your models or macros. They’re designed to test small, isolated pieces of your transformation logic using controlled, mock data.
From there, data tests can be further divided into two sub-categories: generic tests and custom (singular) tests. Let’s dig into each of these individually.
dbt data tests
Data tests are what most people picture when they think of dbt testing. These tests validate the accuracy, consistency, and completeness of your data, acting as guardrails that help you enforce data quality expectations.
Data tests can be broken down further into generic tests and custom tests.
Generic dbt tests
Generic tests are pre-defined, reusable checks that you can apply directly in your schema.yml file. These are perfect for the most common data quality rules and require almost no setup. dbt provides four built-in generic tests:
• Unique: Ensures no duplicate values exist in a column (e.g., primary keys).
• Not Null: Verifies that a column doesn’t contain any NULL values (e.g., every order_id should have a corresponding value).
• Accepted Values: Ensures a column only contains specified values (e.g., payment methods like credit_card, paypal).
• Relationships: Validates foreign key relationships (e.g., every customer_id in orders exists in customers).
Here’s an example of setting up a generic test in your schema.yml file:
```yaml
models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: payment_method
        tests:
          - accepted_values:
              values: ['credit_card', 'paypal', 'bank_transfer']
      - name: customer_id
        tests:
          - relationships:
              to: ref('customers')
              field: customer_id
```
Custom dbt tests
For scenarios where generic tests don’t quite cut it, you can create custom SQL-based tests (also called singular tests). These tests allow you to define specific validation logic, giving you full control. Custom tests live in the tests/ directory and return the rows that fail the validation.
For example, to check for duplicate orders by `order_id` and `order_date`:
```sql
SELECT order_id, order_date
FROM {{ ref('orders') }}
GROUP BY order_id, order_date
HAVING COUNT(*) > 1
```
When you run `dbt test`, dbt flags any rows returned by this query as test failures.
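Assuming you save this query as tests/assert_no_duplicate_orders.sql (the file name here is hypothetical; any .sql file in the tests/ directory works), you can run it on its own:

```bash
# Run only this singular test, selected by file name (without the .sql extension)
dbt test --select assert_no_duplicate_orders
```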
dbt test packages
Building all your tests from scratch can be daunting, but don’t worry: the dbt ecosystem has you covered. There are many community-driven test packages that extend dbt’s functionality with ready-made solutions for common data quality checks and workflow improvements. Different packages serve different purposes, too. Some monitor your data, some enforce best practices, and some make your testing process more efficient.
Popular dbt test packages
- dbt-utils: This package is a bit of a Swiss Army knife, covering foundational tests that apply to most projects. Popular tests include equal_rowcount (ensures two tables have the same number of rows) and expression_is_true (validates custom conditions); see the example after this list.
- Great Expectations: A standalone data validation framework that integrates with dbt. Great Expectations offers a highly customizable way to define, document, and execute complex data quality checks, with features like detailed error reporting and visualizations.
- dbt-expectations: Inspired by Great Expectations, this package provides a collection of pre-built, customizable tests. Examples include expect_column_values_to_be_increasing and expect_column_median_to_be_between. It’s great for adding nuanced checks without setting up the entire Great Expectations framework.
- dbt-checkpoint: A lightweight package that enforces coding standards and best practices in your dbt project. While not strictly for data tests, dbt-checkpoint helps ensure your dbt codebase is consistent, readable, and follows community conventions.
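To make these concrete, here’s a sketch of how a couple of these package tests might look in a schema.yml file. This assumes dbt-utils and dbt-expectations are installed via packages.yml (followed by dbt deps); the orders_snapshot comparison model is hypothetical:

```yaml
models:
  - name: orders
    tests:
      # dbt-utils: orders should have the same row count as its snapshot
      - dbt_utils.equal_rowcount:
          compare_model: ref('orders_snapshot')
      # dbt-utils: assert an arbitrary condition across the whole table
      - dbt_utils.expression_is_true:
          expression: "order_total >= discount"
    columns:
      - name: order_total
        tests:
          # dbt-expectations: catch impossible values
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
```

Because these are generic tests, they slot into the same YAML-driven workflow as dbt’s built-ins, with no extra SQL required on your part.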
dbt unit tests
While data tests ensure the output of your pipeline is correct, unit tests validate the logic of your models or macros. They’re designed to test small, isolated pieces of your transformation logic using controlled, mock data.
For example, imagine you’re writing a complex dbt model that calculates revenue. Your business logic includes handling refunds, discounts, and taxes. If something goes wrong, how do you pinpoint the issue? Unit tests allow you to validate that each piece of logic behaves as expected in isolation before integrating it with real data.
How to write unit tests in dbt
Unit tests are all about validating that specific parts of your logic work as expected when isolated from the complexities of real-world data. dbt Core 1.8 introduced a native unit testing framework (more on that at the end of this section), but you can also simulate unit tests in any dbt version using a combination of mock data, test models, and SQL assertions. Let’s walk through an example step by step.
Imagine you’re building a dbt model that calculates the final price for each order, taking into account discounts. This logic might be part of a larger data transformation pipeline, but you want to test it independently to ensure accuracy.
Step 1: Create mock data
When writing a unit test, it’s helpful to create mock data via seed files so you can control the input data for testing. Seeds act as stand-ins for your source tables, letting you work with predictable data.
Seed file: seeds/test_orders.csv
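The file’s exact contents will depend on your scenario; here’s a minimal example consistent with the model in the next step. The columns come from that model, while the values are illustrative, including a discount larger than the order total and a missing discount to exercise edge cases:

```csv
order_id,order_total,discount
1,100.00,20.00
2,50.00,0.00
3,20.00,35.00
4,75.00,-10.00
5,60.00,
```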
Step 2: Build the transformation model
Next, write the dbt model you want to test. This model calculates the final price for each order by subtracting the discount from the total.
Model: models/calculate_final_price.sql
```sql
WITH source_data AS (
SELECT * FROM {{ ref('test_orders') }}
)
SELECT
order_id,
order_total,
discount,
order_total - discount AS final_price
FROM source_data
```
The transformation takes the mock data from the seed file (test_orders) and applies the business logic: final_price = order_total - discount.
This logic needs to correctly handle edge cases, like negative discounts, discounts larger than the order total, or NULL values. Unit testing will help ensure your transformation behaves as expected.
Step 3: Write the unit test
Now, write a test to validate your transformation. The goal is to check if the calculated final_price ever falls below zero—an invalid scenario in this business context.
Unit Test Query: tests/test_calculate_final_price.sql
```sql
SELECT *
FROM {{ ref('calculate_final_price') }}
WHERE final_price < 0
```
This query selects any rows from the calculate_final_price model where final_price is negative. If the test returns rows, it indicates a failure because the business logic isn’t handling edge cases (like negative discounts) properly.
Step 4: Run the test
To execute the unit test, run the following commands:
```bash
dbt seed --select test_orders                  # Load the seed file into your database
dbt run --select calculate_final_price        # Build the model
dbt test --select test_calculate_final_price  # Run the unit test
```
If everything works, the test will pass with no rows returned. If the test fails, dbt will flag the rows that don’t meet the conditions.
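With the illustrative seed above, only order 3 trips the test (20.00 - 35.00 = -15.00). If you configure the test with store_failures and query the failures table, the offending row might look like this:

```text
order_id | order_total | discount | final_price
---------+-------------+----------+-------------
       3 |       20.00 |    35.00 |      -15.00
```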
This failure tells you that your logic isn’t correctly handling discounts that exceed the order total. You can then refine your model to address the issue (e.g., clamp the result at zero or filter out invalid discount values).
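For instance, here’s one way you might harden the model, assuming your warehouse supports COALESCE and GREATEST (most major warehouses do); whether to clamp, filter, or fail is ultimately a business decision:

```sql
WITH source_data AS (
    SELECT * FROM {{ ref('test_orders') }}
)

SELECT
    order_id,
    order_total,
    COALESCE(discount, 0) AS discount,
    -- Treat a missing discount as zero, then clamp at zero so a
    -- discount larger than the order total can't go negative
    GREATEST(order_total - COALESCE(discount, 0), 0) AS final_price
FROM source_data
```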
Once you’ve identified and fixed any issues, rerun the test to ensure your changes work as expected. This iterative process helps you lock in the correctness of your transformation logic before it’s applied to live data.
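One last note: if you’re on dbt Core 1.8 or later, you can express this kind of check with the native unit testing framework instead of the seed-based simulation above. A minimal sketch, with illustrative values:

```yaml
unit_tests:
  - name: test_final_price_subtracts_discount
    model: calculate_final_price
    given:
      - input: ref('test_orders')
        rows:
          - {order_id: 1, order_total: 100, discount: 20}
    expect:
      rows:
        - {order_id: 1, final_price: 80}
```

Native unit tests mock their inputs at build time, so you can run them (e.g., via dbt test --select test_type:unit) without loading any seed data into your warehouse.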
How do dbt tests and data observability work together?
dbt tests excel at verifying known, static issues in your data, such as ensuring referential integrity, catching null values, or checking for duplicate primary keys. They are structured, explicit, and easy to maintain when used for targeted purposes.
But as data pipelines grow more complex, relying solely on dbt tests can lead to challenges like test suites that balloon to hundreds or thousands of tests.
This is where Metaplane’s data observability platform complements dbt. Instead of manually setting up hundreds of scalar outcome tests with hardcoded thresholds, Metaplane automatically monitors trends, detects anomalies, and alerts you when data behaves unexpectedly—all with minimal setup. It saves data teams from having to encode every potential scenario into a dbt test and provides insights into dynamic, unknown issues like unusual spikes in row counts or changes in data patterns.
Increase trust in your data with dbt tests
Testing in dbt isn’t just about preventing mistakes—it’s about building trust in your data. By leveraging a mix of data tests, unit tests, and assertions, you can create a pipeline that not only works but delivers consistent, high-quality insights.
Start small with generic tests, explore custom and unit tests as your needs grow, and don’t forget to tap into the power of pre-built packages. With the right approach, dbt testing will become one of the most valuable tools in your data engineering toolkit.
Try Metaplane today and pair your dbt tests with ML anomaly detection for better data quality.