Data vibe coding, SQL etiquette, and how to land a data engineering gig

Welcome to Overheard in data, a monthly roundup of news you can use from around the data world.

March 26, 2025
Co-founder / Data and ML

Writer / Data

The data world is, as you might expect, chronically online. It’s also pretty scattered. There are tons of great conversations happening across Reddit, Twitter/X, Substack, Medium, and yes, even LinkedIn.

So, in the spirit of data engineering, here’s an attempt to take information from multiple sources, funnel it to a central location, and format it for you to read and enjoy. Welcome to Overheard in data, March ‘25 edition.

Data and vibe coding

I’m not exactly sure what to even call vibe coding. A hobby? A discipline? A method? While its effectiveness is still up for debate, there’s no doubt it’s pretty fun and a quick way to get a concept from idea to minimum viable product. 

While most of the vibe coding projects we’ve seen so far have been in the software world, Jacob Matson recently asked how it translates to data.

The vibe coding trend raises interesting questions for data engineering—particularly around how much engineers should rely on LLMs for production-ready code.

Why this matters for data teams:

1. SQL queries generated by LLMs often look correct but can silently introduce errors, especially with complex transformations or edge cases

2. When data engineers don't fully understand their pipelines, debugging becomes challenging when something breaks

3. The path of least resistance is tempting and there are genuine efficiency gains to be had
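To make the first point concrete, here is a minimal sketch of one way a plausible-looking query can go quietly wrong: a join that fans out rows before an aggregate. The `orders` and `shipments` tables and their values are hypothetical, invented purely for illustration, and the example runs against an in-memory SQLite database rather than a real warehouse.

```python
import sqlite3

# Hypothetical schema and data, made up for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    -- Order 1 was split into two shipments.
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
    """
)

# Looks reasonable: "total revenue for shipped orders".
# The join repeats order 1 once per shipment, so its amount is counted twice.
naive_total = conn.execute(
    """
    SELECT SUM(o.amount)
    FROM orders o
    JOIN shipments s ON s.order_id = o.order_id
    """
).fetchone()[0]

# Same intent, but each order is counted at most once.
safe_total = conn.execute(
    """
    SELECT SUM(o.amount)
    FROM orders o
    WHERE EXISTS (SELECT 1 FROM shipments s WHERE s.order_id = o.order_id)
    """
).fetchone()[0]

print(naive_total)  # 250.0 (inflated)
print(safe_total)   # 150.0
```

Both queries run without errors and return a single number, which is why this kind of bug tends to slip past a quick read. Catching it takes knowing the grain of each table, which is the pipeline understanding the second point is about.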

The most effective data engineers I know are finding the sweet spot: using AI to accelerate routine tasks (especially via tab completion) while deepening their understanding of core systems. They can explain every critical pipeline component, even if an LLM helped draft it.

I hope we can keep walking that line, where GenAI tools augment our capabilities rather than replace our expertise. Expect the teams who successfully thread this needle to start pulling ahead of the pack.

Data as a mold problem

On LinkedIn, Stephen Bailey gave a really thoughtful framework for analyzing problems in terms of either:

  • Rust logic: Treating systems as mechanical, with aging/wearing/corrosion as primary risks
  • Mold logic: Treating parasitic/invasive growths as primary risks

I love how this perspective reframes our understanding of data issues: "Bad data is like mold spores; flexible schemas are like unsealed surfaces. Once they get in, it's actually unknowable where they might end up." Dashboard graveyards aren't just cluttered—they're decomposing feeding grounds for data parasites.

Rather than attempting complete sterilization, Bailey suggests "managed overgrowth"—treating data management like gardening. Allow some natural decay, limit growth in specific areas, and replace compromised systems with hardier varieties.

This approach shifts what makes a 10x data engineer away from just technical prowess, and more toward ecosystem awareness—understanding how bad data infiltrates and spreads throughout your systems.

Why are we all screaming in SQL?

There’s not a lot of insight to add to this one, but this made me laugh more than any other data engineering-related post did all month.

What’s your SQL style—all caps? All lowercase? Somewhere in between?

Why you aren’t landing a DE job (and how to land one)

One of the most common topics of conversation across r/DataEngineering and beyond is how to land a job in data engineering—whether you’re a fresh grad or switching from an adjacent profession.

One Reddit user listed some reasons why they think you might not be landing the gig. The chief reasons:

  • We’re seeing fewer entry-level roles because LLMs perform much of that work
  • Data engineering never really was an entry-level role, but more of something people stumbled into
  • You live outside of a major city and, therefore, your options for working somewhere hiring DE roles are limited
  • You need to work on your people skills (maybe it’s all the SQL screaming?)

I also came across some help for those of you looking to nail a data engineering interview.

Big thanks to Zach Morris Wilson on Twitter/X, who put out a concise but very valuable cheat sheet for passing your SQL interviews. It includes practical SQL tips like when to use specific functions over others, as well as soft skill tips like asking questions about the assignment and being able to articulate your rationale.
