How Imperfect Foods uses Metaplane, Snowflake, and dbt to break down data silos

"Without Metaplane, we wouldn’t be as proactive with data quality. There would be a lot of unknown data issues out there lurking that we’d discover when someone stumbles upon them."

Adam Smith, Analytics Manager

7 Days: Reduction in Time to Identify Data Incidents
4 hrs/week: Engineering hours saved

Imperfect Foods delivers groceries with a mission to eliminate food waste and build a better food system for everyone.
https://www.imperfectfoods.com/

Industry: Food Delivery
Size: 400 employees
Stack: Snowflake, dbt, Mode

Challenge: Breaking down data silos as the company and data use cases grew

Imperfect Foods is an e-commerce startup that exists to help eliminate food waste and build a better food system for everyone. Data is used by every team at Imperfect Foods and is at the center of most decisions made by the company. Whether teams are calculating customer acquisition costs for different marketing channels, running experiments for signup conversion, or trying to understand inventory and fulfillment bottlenecks, data has become critical in the company’s operations.

As the company grew, Adam Smith, an Analytics Manager at Imperfect Foods, noticed that it became harder to understand how the business was doing holistically because the majority of the data had become siloed across teams.

With a team of only four analysts supporting over 200 data consumers, there weren’t always enough resources to centralize and operationalize data to answer important questions. When data was being used, it wasn’t uncommon for teammates to ask in the analytics Slack channel whether the data could be trusted.

"The way we found data problems was when someone posted in our Slack channel that a report wasn’t working or was returning weird results. We’d normally catch data quality problems with end users days or weeks after something happened."

Adam and his team needed a way to centralize data in one place, create a scalable process to model data for everyone to use, and monitor data quality so the entire organization trusted the data.

Solution: Remove the friction of using trusted data and leverage everyone’s SQL skills

Snowflake as the centralized data cloud

Adam and his team were able to centralize data in Snowflake so any teammate could answer questions about customers, finance, and marketing. Because Snowflake’s model makes it simple to scale resources up and down, the data team didn’t have to worry about maintaining databases or spinning up resources.

In addition to easily scaling resources, Snowflake’s integrations made it easy for Adam’s team to ingest and report on data. The Snowflake data share functionality dramatically reduced the friction of using data shared by a half dozen partners. Data shares removed the need for complex ETL systems and simplified data governance and processing.

"Because [Snowflake] is a market leader, everyone has a connection to Snowflake whether you're ingesting new data like a Fivetran or reporting on it using Mode."

Not only did Snowflake reduce the complexity of maintaining a data stack, but the usage-based model also helped Imperfect Foods manage costs more predictably and save 20% compared to the beginning of the year.

"The usage-based pricing allows us to scale with the tool and have predictable costs."

With Snowflake’s separation of compute and storage, integrations, and data sharing, Adam and his team were able to finally centralize data that the entire team could use without needing to hire additional headcount.

dbt Cloud for modeling data and creating leverage for the data team

Once the data was stored in Snowflake, the Imperfect Foods data team used dbt Cloud to model it. Like many startups, the team was short on people and needed a tool that was both easy to use and versatile. dbt fit well: it let the data analysts model data without needing advanced SQL knowledge.

"It’s really easy for an analyst without a lot of SQL knowledge about inserts, updates, and stored procedures to use dbt. You can use select statements to do the hard work for you and leverage the skills they had to build out a data warehouse to answer the questions the organization needed."

Because dbt has software development practices and documentation built in, it helped the team move quickly and make fewer mistakes. Features like lineage gave analysts an easy way to understand the downstream impact of a change, especially when paired with Metaplane.

With dbt Cloud, Adam’s team was able to easily operationalize the data in Snowflake. Analysts could model data using simple SQL statements, and the rest of the organization could rely on this data to automate business processes around product, marketing, and sales. Because of dbt’s flexibility and scalability, the tool has grown with them over time, which has allowed them to remain lean.

"dbt has been an incredibly flexible tool that has grown with us. We still don’t have the need for a data engineer or database developer."

Metaplane as their automated data observability platform

After the data team centralized data models for more of the organization to use, they still needed a way to monitor the quality of data. Because data was being used in critical paths of the business, it wasn’t enough that the data was available to everyone; it needed to be trusted, too.

Adam and his team were able to set up Metaplane in minutes because of the integrations with Snowflake, dbt, and Mode. Their most important tables and columns were automatically monitored, and they started receiving schema change alerts within hours.

"[Metaplane] is really easy to use and you don’t need any training. Automatic monitoring and the level of detail you can get right in Slack is really helpful."

Every Metaplane alert includes potential root causes and the downstream impact, such as affected Mode reports, to help assess the priority of the issue.

As a small team, time is hard to come by, so being able to understand the impact of data incidents and prioritize work is invaluable. Because Metaplane monitors all of the team's data tools, the team always has context when issues occur. For example, if a data incident affects a report that runs on Monday mornings, but it’s Tuesday afternoon, the team knows they don’t need to drop everything to fix the issue.

"Because of Metaplane’s integrations, we receive context with every data incident. For example, it can tell us a table is broken in Snowflake, it may have something to do with this dbt job, and it’s impacting these 5 reports. It allows you in one place to get a good handle on how big of a problem this is."
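The Monday-report example above amounts to comparing the current time against the next scheduled run of each affected report. The following is only a rough sketch of that reasoning, not how Metaplane or Mode work internally: the function names, the weekly schedule representation, and the 24-hour cutoff are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def next_weekly_run(now, weekday, hour):
    """Return the next weekly run strictly after `now` (weekday: Monday=0)."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # This week's run has already happened, so look at next week's.
        candidate += timedelta(days=7)
    return candidate


def urgency(now, weekday, hour, urgent_within=timedelta(hours=24)):
    """Label an incident by how soon the affected report runs next.

    The 24-hour default is an arbitrary, hypothetical cutoff.
    """
    time_left = next_weekly_run(now, weekday, hour) - now
    return "urgent" if time_left <= urgent_within else "can wait"


# A report that runs Mondays at 09:00, with an incident found on a Tuesday afternoon.
tuesday_afternoon = datetime(2024, 1, 2, 15, 0)  # 2 January 2024 was a Tuesday
print(urgency(tuesday_afternoon, weekday=0, hour=9))
```

With several affected reports, taking the most urgent label across all of them would give a simple overall priority for the incident.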

Adam and his team also use Metaplane’s dbt job duration monitoring, which uses machine learning to automatically alert data teams when their dbt models start taking longer to run.

Adam and his team receive an alert when dbt jobs take longer than expected to run, helping them understand potential model bottlenecks.

"I love the dbt run time monitoring which helps us understand performance and spend. To have that history there and go back, we can see the improvements we made as a team over time."
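To make the idea of run-time alerting concrete, here is a minimal sketch that flags a dbt job run as slow when it falls far outside the job's own recent history. It uses a plain z-score as a stand-in; it is not Metaplane's actual model, and the function name, threshold, and sample durations are hypothetical.

```python
from statistics import mean, stdev


def is_slow_run(history_seconds, latest_seconds, z_threshold=3.0, min_runs=5):
    """Return True when the latest run is unusually slow versus past runs.

    history_seconds: durations of earlier runs of the same job, in seconds.
    z_threshold and min_runs are illustrative defaults, not tuned values.
    """
    if len(history_seconds) < min_runs:
        return False  # too little history to form a baseline
    baseline = mean(history_seconds)
    spread = stdev(history_seconds)
    if spread == 0:
        # Every past run took exactly the same time; any slowdown stands out.
        return latest_seconds > baseline
    return (latest_seconds - baseline) / spread > z_threshold


# Hypothetical durations (seconds) for one dbt job over the past week.
history = [312, 298, 305, 321, 300, 309, 315]
print(is_slow_run(history, 318))  # close to the usual run time
print(is_slow_run(history, 540))  # far slower than any past run
```

Keeping the duration history around, as the quote above notes, also makes it possible to see whether optimization work actually shortened run times.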

With Metaplane, the data team at Imperfect Foods could rest assured that the data they worked so hard to model could be trusted by the entire organization. If anything in their data stack failed, they became confident they would be the first to find out and could fix issues proactively.

"Without Metaplane, we wouldn’t be as proactive with data quality. There would be a lot of unknown data issues out there lurking that we’d discover when someone stumbles upon them."

Results

  • Using the Snowflake data cloud, the data team could centralize all data and no longer needed to worry about scaling databases.
  • Snowflake helped centralize all partner data using data shares, saving weeks of ETL development time.
  • dbt helped the data team consistently model data using simple SQL statements, unlocking superpowers for the analysts.
  • dbt brought software engineering best practices and automation that helped the team remain lean; they didn’t need to hire an additional database administrator or data engineer.
  • Metaplane ensured trust in the data and team. Discovering data quality issues went from days or weeks to minutes or hours.
