How Reforge used Snowflake, dbt, and Metaplane to 3x their team and save 18 hours a week

"Because of Metaplane, I spook people with how quickly I find data issues now. It helps me understand what's going on with our data and helps me catch issues that are painful to untangle."

Daniel Wolchonok
Head of Data
350%
Team growth rate
18/week
Hours saved
Reforge is a learning and community platform that helps tech professionals do the best work of their career. Reforge connects its members to industry experts, industry leading content, and teaches these materials in cohort based courses. It has a vibrant and strong ecosystem where members connect with each other and industry experts to help talk about what they’re working on and learn from each other.
https://www.reforge.com/
Industry
E-Learning
Size
500 employees
STACK
Snowflake
dbt
Metabase
Metaplane

Challenge: Growing a data team as the demand for insights increases

Data has always played a critical role in driving Reforge’s business forward. Since day one, data was leveraged to understand important metrics like conversion rates and user engagement. As Reforge experienced rapid growth, their Head of Data, Daniel Wolchonok, wanted to make sure the data team kept up with this pace and made new metrics and insights available to the larger team.

❝Now, we have a larger number of users and all the responsibilities of a SaaS business. We’re focused on metrics like new revenue, retained revenue, expansion, contraction, and churn. Our data needs have exploded and I expect that will only accelerate.

As the number of stakeholders relying on data skyrocketed, Dan and his team ran into scalability issues with their data warehouse performance and data engineering development practices.

The first data warehouse that Dan and his team built used Postgres, a popular open source OLTP database. When more teams started using data, queries routinely timed out for different stakeholders, making it harder for them to use data. After continually increasing to more powerful instances, Dan’s team felt like they were spending more time figuring out how to optimize their Postgres resource utilization rather than on what really mattered - bringing insights to the business.

Because Reforge didn’t previously have a dedicated data team, the data stack was developed by the engineering team using software engineering tools. The data models lived in the engineering team’s GitHub repo and were managed like a Ruby on Rails migration schema. Those systems worked well enough when the company was small, but they weren’t designed to scale with more complex data needs or a fast-growing team. Changes to data models in the core GitHub repo required submitting a pull request and waiting for engineering resources, so they’d take the quick and dirty route — making changes directly in the data warehouse. Of course, that meant there was no version control and sometimes those changes would break something downstream. The result was, as Wolchonok describes it, “a bad experience for everybody.”

Solution: Building a scalable data stack on Snowflake with dbt and Metaplane 

To Daniel Wolchonok and others at Reforge, it was clear that they needed to migrate to a scalable data stack. As a newly-formed and growing data team, they wanted a data stack built specifically to fit their needs, not one cobbled together from tools designed for software engineering.

❝I want to empower each function with the best-of-breed tools, rather than one monolithic platform. There’s a wealth of tools out there that are really easy to stitch together.
Reforge uses the best-of-breed tools to build a cutting edge data platform that has helped power Reforge's use of data and rapid team growth.

Snowflake as an easy to use and scalable warehouse

Searching for a warehouse that is easy to get started with but could scale with their needs over time led Dan and his team to adopt Snowflake. Snowflake made it simple to set up development, staging, and production databases so that the data engineers could easily move data from Postgres to Snowflake. After finishing the migration, it was immediately obvious how the separation of storage and compute allowed the data team to focus on helping their teammates answer questions about the business, rather than worrying about provisioning resources or storage costs.

❝It’s blown me away how quickly you go from getting by with some quick and dirty hacks to needing tools like Snowflake and dbt to break up and model datasets, manage testing, staging environments, deployment, all that stuff.

dbt for implementing data engineering best practices and showcasing the impact of data

Introducing dbt has been a milestone for Reforge because it encouraged data engineering best practices and made it easier to migrate to Snowflake. When Reforge switched to Snowflake, all they needed to do was swap the source and the transformations would run in their new Snowflake environments. Using dbt cloud made it easier to build and maintain models over time, rather than data engineers making changes directly to the warehouse. When dbt became a part of Reforge’s CI/CD process, the data team felt empowered to own data modeling changes end to end.

With dbt, it also became easier for Reforge to showcase how data is used across the organization. Being able to see a visualization of the DAG and data lineage helped the team understand and share how data is being used.

❝Engineers can see how the data they model in dbt becomes data that other teams are acting on. They can understand why we make certain requests. It makes a world of difference in their motivation and satisfaction.

Metaplane for Data Observability

Even before his work at Reforge, Dan was aware of the challenge of data monitoring and had experienced first-hand the chaos that can happen when data issues go unnoticed. Because the data team is growing and wears many hats, they found themselves responsible for managing thousands of objects in Snowflake.  Before Metaplane, it wasn’t uncommon for them to find out tables had been missing weeks of data or the distribution of data had considerably drifted from source systems. 

Dan and his team also found it hard to keep up with the fast pace of changes from upstream application databases. While they were at the center of data, they could be the last to find out about a data quality issue. Dan and his team were often digging around in the database looking for new tables and trying to understand changes just from column names and relationships. It was fine early in Reforge’s existence, but it was clearly not scalable.

Since implementing Metaplane, Dan and his team have been the first to know about data quality issues and changes in their data. When an engineer accidentally loads contacts to the wrong place, the data team finds out immediately, allowing them to fix the problem before it compounds or goes unnoticed for weeks. With Metaplane monitoring the data pipelines and reporting back, Reforge’s Head of Data has built a different reputation with his team - one of trust.

❝Because of Metaplane, I spook people with how quickly I find data issues now. I know when events are created with crappy names, when there are new attributes, when someone puts a unique ID in a segment event. It helps me understand what's going on with our data and helps me catch issues that are painful to untangle.

Results

  • Reforge now has a scalable data stack built on Snowflake, dbt, and Metaplane that can service the entire organization and scale with a rapidly growing team
  • Using dbt and Snowflake, the data team was able to easily migrate from a poorly performing Postgres instance to a Snowflake environment where they didn’t need to constantly tune resources.
  • After migrating to Snowflake, BI queries and dbt transformation queries that used to take hours now take minutes.
  • Functions across the company have access to consistent and accurate data based on sophisticated data models built in dbt — everyone is working from the same set of data.
  • Using Metaplane, the Reforge data team is the first to know about data quality issues. When issues occur, the team has a starting point to find the root cause and an understanding of the downstream impact to models and BI dashboards.

What's Next at Reforge

Focusing on what matters most for Reforge’s continued success: insights from data

As Reforge’s team grows, they can now focus on what is most important to the business - enabling the larger organization to ask questions and find insights in the data.

They no longer need to worry about tuning their warehouse to handle increased loads because Snowflake can handle any workload size. The data team now manages data models in a scalable way and can be confident changes aren’t being made directly to the warehouse. And, as the company grows, Dan can easily show how the data is being used more easily. Lastly, the data team has preserved trust with the growing organization because they are the first to know about data quality issues and can broadcast issues they are fixing to the larger team.

More customer stories

How Bluecore Reduced Integration Issues by Over 75% with Metaplane
Rinsed washes away data quality issues with Metaplane
Census Activates Data Quality Improvements with Metaplane
Veronica Beard sets up a data stack (almost) as good looking as their clothes
Parachute Home Makes Their Data as High Quality as their Bedding with Metaplane
Dribbble designs a data quality solution with Metaplane
Metaplane keeps data right for Upright
Metaplane propels LogRocket's data quality forward
How Clear Street uses Metaplane to Prevent $100M+ Worth of Data Quality Issues
How Sigma uses Metaplane to track impacts to Sigma Workbooks
Vivian Health Improves Data Quality with dbt and Metaplane to Connect Healthcare Professionals with their Dream Job
Gorgias knows their data is always accurate and gorgeous with Metaplane
How Clearbit cleared up data quality to increase customer retention and improve detection time by 3x+
How Teachable Used Metaplane and Sigma to Achieve Their Data Quality and Visibility Goals
How SpotOn reduced time to actionable data by 6x and increased data engineering contribution by 8.5x using Snowflake, dbt Cloud, and Metaplane
​​How Car and Classic’s adoption of Metaplane, Snowflake, and dbt led to a 10x improvement in report load time and ensured trust in the data
How Imperfect Foods uses Metaplane, Snowflake, and dbt to break down data silos
How Metaplane, Snowflake, and dbt help Vendr run a lean and adaptable data team
How Mux increased test coverage from 10% to 95% with just a few clicks
How Appcues reduced data quality issues by 77% using Metaplane, Snowflake, and dbt
Ensure trust in data

Start monitoring your data in minutes.

Connect your warehouse and start generating a baseline in less than 10 minutes. Start for free, no credit-card required.