Census Activates Data Quality Improvements with Metaplane

“Census helps me amplify analytics engineering work by ensuring data is used for more than just reporting. Metaplane helps me move faster, more confidently, and with greater trust from stakeholders.”

Julie Beynon, Head of Data
Census is the first Data Activation platform built on your warehouse.
https://www.getcensus.com/
Industry: Reverse ETL
Size: 180 employees
Stack: Census, Snowflake, dbt, Fivetran, Metaplane

Learn how Julie, Head of Data at Census, thinks about establishing a data program and the role that Metaplane has to play.

There are two things that will stand out to you in this case study:

  • Census - This is probably a tool that you’re either using right now, or considering using in the near future. Census pioneered the Reverse ETL category to sync data from your data warehouse into your company’s SaaS applications. Data teams at Canva, ClickUp, and Genesis Motor use Census to personalize marketing campaigns, prioritize sales leads, and automate countless other business processes. 
  • Julie Beynon - If you think you’ve seen this name before, you’re right (this is her second case study). Now in her second Head of Data role, she’s the first dedicated data hire at Census, overseeing all analytics responsibilities.

How data is used at Census

If you joined Julie’s team today, the first thing you would notice when querying Census’s Snowflake instance is probably that one database is named Cholula, after the Great Pyramid of Cholula (not the hot sauce). Immediately after that, you’d realize that the data team owns analytics for applications across the entire company, covering:

  • Marketing analytics - using data from sources such as Facebook Ads, Hubspot, and LinkedIn Ads
  • Product analytics - with product usage data stored in their PostgreSQL instance
  • Revenue analytics - using data from Salesforce (their CRM) and Nue.io (their billing system)

Julie shared this piece of insight when it comes to setting up a data stack:

“I wouldn’t start a new job if they didn’t have or weren’t open to having Census, dbt, and Metaplane. After getting a warehouse, I recommend setting up dbt to model your data and immediately answer ad-hoc questions. After that, I’d implement Census to directly feed the data into business applications to minimize downtime between insight to decision and I’d layer Metaplane over all of this to make sure that we were only activating with accurate data.”

Experience with Metaplane

Julie had previously found success with Metaplane’s free tier. By moving to a paid plan at Census, she was able to grow trust in data faster than she had before.

Finding data quality issues

Metaplane uses machine learning to find anomalies in your data. When incidents are found, alerts are sent to a collaboration tool such as Slack, Microsoft Teams, or PagerDuty, to fit within common workflows. Julie and the team at Census focus on two groups of issues:

  • Schema changes - In a recent case, the Operations team at Census added new custom fields in Salesforce. The resulting alert prompted Julie to outline a plan for accommodating these new fields in ongoing analytics.

In other words, schema change alerts helped bridge the gap between business operations teams and downstream business stakeholders, while educating everyone (in the Slack channel) on data dependencies.

  • Anomalous data updates - Through a combination of monitors finding issues with freshness, row count, and dbt job durations, Julie and the broader Census team are able to find the majority of issues at the moment that they occur. Freshness and row count monitors are heavily utilized in the Cholula database mentioned at the beginning of this piece, which is also where their dbt job outputs are materialized. By coordinating these different types of monitors along the dbt outputs, they’re able to identify any production issues that would impact business stakeholders relying on this modeled data.
“Metaplane’s freshness, row count, and dbt job duration monitor alerts give us early indicators of data issues. Most perceived data incidents stem from stale or partial data, caused either by ingestion or modeling errors. By using Metaplane, we can often ‘intercept’ bad data and alert stakeholders before the next time they log into the business applications that Census is feeding data into.”
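
To make the idea of freshness and row count monitors concrete, here is a minimal sketch in Python. It is not how Metaplane works internally (the case study only says Metaplane uses machine learning); it swaps in a fixed age threshold and a simple z-score, and the function names and sample numbers are assumptions made up for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def is_stale(last_loaded_at, now, max_age):
    """Freshness check: True if the table has not been updated within max_age."""
    return now - last_loaded_at > max_age


def row_count_is_anomalous(history, latest, z_threshold=3.0):
    """Row count check: True if `latest` sits far outside the recent history.

    Uses a plain z-score as a stand-in for a learned model (an assumption).
    """
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold


# Hypothetical daily row counts for a modeled table.
recent_counts = [1000, 1010, 990, 1005, 995]

partial_load = row_count_is_anomalous(recent_counts, 400)
normal_load = row_count_is_anomalous(recent_counts, 1003)
stale = is_stale(
    last_loaded_at=datetime(2024, 1, 1, 0, 0),
    now=datetime(2024, 1, 1, 7, 0),
    max_age=timedelta(hours=6),
)
```

In this sketch a load of 400 rows against a history of roughly 1,000 is flagged as partial data, and a table last loaded seven hours ago fails a six-hour freshness window, which matches the "stale or partial data" pattern Julie describes.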

Improving data stack ROI

Catching anomalous dbt job durations removes one troublesome aspect of running dbt that has nothing to do with actual modeling, which encourages further usage. But what about the rest of the data stack?

As Julie continues to build the data strategy, the team has also been able to use Metaplane to:

  • Move faster to analyze new datasets - In addition to Census data activation feeds, the team also uses Omni to find additional business context. Julie has recently been onboarding additional schemas and tables for data that has historically been used in Omni, which requires her to adjust object references in her queries. By using Metaplane’s column-level lineage, she is able to quickly understand how tables are used downstream to ensure a smooth cutover.
“I’ve been able to make changes so much faster by having the context of upstream and downstream dependencies at my fingertips.”
  • Optimize Snowflake spend - As the first hire for all-things-data, Julie was also tasked with reviewing Snowflake spend to make sure that historical artifacts were taken care of and credits were used wisely. To do this, she used Metaplane’s Snowflake Spend Analysis feature to understand where credits were being used most heavily: frequent jobs, compute-intensive queries, and even service users.
“The real-time notification feature for unexpected high or low spend is particularly useful, as it allows for immediate action and adjustments. This proactive approach to managing Snowflake usage not only helps in controlling costs but also ensures that our data infrastructure is efficient and well-suited to our organization's needs.”

How Census uses the Census integration

When an issue is discovered, Census uses Metaplane’s brand-new Census integration to understand how their own Census syncs are impacted. Using Metaplane’s automatically generated column-level lineage graphs, Julie and the team are able to understand which Census destinations are downstream of any data quality issue.

“I’m able to immediately see which business applications are affected and use that to notify our stakeholders that we’re looking into resolving their issue. It’s not only helped us understand the impact immediately, but also how to better prioritize resolving issues based on the workflows that were affected.”
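
The impact analysis described here boils down to walking a lineage graph downstream from the object with the issue. Below is a small Python sketch of that idea. It is a simplification under stated assumptions: it works at table level rather than column level, the graph is written by hand instead of being generated automatically, and every object and sync name (including the "sync:" prefix) is hypothetical.

```python
from collections import deque


def downstream_of(lineage, start):
    """Return every node reachable downstream of `start` (breadth-first)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def affected_syncs(lineage, start):
    """Keep only the downstream nodes that represent Census syncs."""
    return sorted(n for n in downstream_of(lineage, start) if n.startswith("sync:"))


# Hypothetical lineage: each key feeds the objects in its list.
LINEAGE = {
    "raw.salesforce.account": ["cholula.analytics.accounts"],
    "cholula.analytics.accounts": [
        "cholula.analytics.account_health",
        "sync:hubspot_companies",
    ],
    "cholula.analytics.account_health": ["sync:salesforce_account_scores"],
    "raw.billing.invoices": ["cholula.analytics.revenue"],
}

impacted = affected_syncs(LINEAGE, "raw.salesforce.account")
unaffected = affected_syncs(LINEAGE, "raw.billing.invoices")
```

With this toy graph, an issue in the raw Salesforce account table reaches both syncs, while an issue in the billing invoices table reaches none, which is the kind of answer that helps decide which stakeholders to notify first.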

Issue Prevention

As you can probably infer, a burgeoning data strategy built on an existing data stack calls for a lot of changes, which is directly reflected in the volume of dbt-related pull requests. Being relatively new to the data team, Julie mentions that it’s been useful to have Metaplane’s Data CI/CD feature forecast what downstream changes would result from updates to those dbt models.

“Having the Metaplane app directly in GitHub sitting next to my dbt PR has been so useful to help me prevent making changes that I didn’t want to. By seeing the deltas in values of dependent models, I can proactively go to stakeholders, educate them on the changes I’m making, and identify what impact it’d have on their work. It’s a necessary step to retain the trust that the company has in data.”
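
As a rough illustration of "seeing the deltas in values of dependent models", the Python sketch below compares summary statistics for one model before and after a proposed change. It is an assumption-laden toy, not Metaplane's Data CI/CD feature: the statistic names and values are invented, and in practice they would come from querying the production and pull-request builds of the model.

```python
def summary_deltas(before, after):
    """Return {statistic: (old, new)} for every statistic that changed.

    A statistic missing on one side is reported with None on that side.
    """
    deltas = {}
    for key in sorted(set(before) | set(after)):
        old = before.get(key)
        new = after.get(key)
        if old != new:
            deltas[key] = (old, new)
    return deltas


# Hypothetical summaries of a dependent model in production vs. the PR build.
production = {"row_count": 1200, "avg_mrr": 310.5, "null_owner_pct": 0.02}
pull_request = {
    "row_count": 1180,
    "avg_mrr": 310.5,
    "null_owner_pct": 0.02,
    "distinct_plans": 4,
}

changes = summary_deltas(production, pull_request)
```

Here the comparison surfaces a drop in row count and a newly added statistic while leaving the unchanged values out, which is the short list worth bringing to stakeholders before merging.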

The future of Census’s data program

As a data technology company, data isn’t only something that Census cares about for its customers. It sits at the beating heart of the business. 

As Julie implements more reporting and data activation workflows, it becomes increasingly important for the team to ensure that data is clean. 

“Census helps me amplify my analytics engineering work by ensuring data is used for more than just reporting. Metaplane doesn’t only ensure I avoid regressing on that impact – it helps me move faster, more confidently, and with greater trust from my stakeholders.”
