Get the essential data observability guide
Download this guide to learn:
What is data observability?
4 pillars of data observability
How to evaluate platforms
Common mistakes to avoid
The ROI of data observability
Unlock now
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Sign up for a free data observability workshop today.
Assess your company's data health and learn how to start monitoring your entire data stack.
Book free workshop
Sign up for news, updates, and events
Subscribe for free
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Getting started with Data Observability Guide

Make a plan to implement data observability across your company’s entire data stack

Download for free
Book a data observability workshop with an expert.

Assess your company's data health and learn how to start monitoring your entire data stack.

Book free workshop

Snowflake COALESCE: A safety net for NULL values

Learn all about how to use the COALESCE function in Snowflake, when you might use it, and how it compares to other NULL-handling functions.

and
February 13, 2025

Writer / Data

February 13, 2025
Snowflake COALESCE: A safety net for NULL values

“In this world, nothing can be said to be certain, except death and NULL values,” Benjamin Franklin once wrote.

Wait, that doesn’t seem right. Well, either way, the saying holds true. NULL values in your data warehouse are an inevitability, and it’s important to have a plan for dealing with them before they throw off a calculation and corrupt your data.

The `COALESCE` function in Snowflake is one of the best ways to handle NULL values, and in this post, we’ll walk through what it does, when you might use it, how it compares to other NULL-handling functions, and some pro tips to keep in mind.

Like Benjamin Franklin actually said, “Lost time is never found again,” so with that in mind, let’s dive in.

What is the COALESCE function in Snowflake?

The `COALESCE` function in Snowflake returns the first non-NULL value from a list of expressions. This simple but powerful tool helps you manage NULL values in your queries, ensuring that missing data doesn’t disrupt calculations or necessitate extensive case handling.

It’s sort of like finding the first available seat on a bus. If the first one is taken (NULL), you move to the next until you find an open spot.

Basic syntax for COALESCE

```sql

COALESCE(expression1, expression2, ..., expressionN)

```

  • Checks `expression1` → if not NULL, returns it and stops
  • If NULL, moves to `expression2` → if not NULL, returns it and stops
  • Keeps going until it either finds a non-NULL value or runs out of expressions
  • If everything is NULL, returns NULL

Let's see it in action with a simple example. Say you're working with user data where some profile fields might be empty:

```sql

SELECT 

    user_id,

    COALESCE(full_name, username, email, 'Anonymous User') AS display_name

FROM users;

```

This query tries to find the best available identifier for each user, starting with `full_name` and falling back to `Anonymous User` as a last resort.

Common use cases for COALESCE

The `COALESCE` function is versatile and widely used in various scenarios. Let’s explore some common ways data engineers leverage it in Snowflake.

1. Replacing NULLs in queries

When dealing with missing data, `COALESCE` ensures default values are used instead of NULL.

Imagine you’re managing a customer support database where some users haven’t provided a phone number. Instead of displaying a NULL value in reports, you’d want to indicate that the number is missing.

```sql

SELECT customer_id, COALESCE(phone_number, 'No phone available') AS contact_number

FROM customers;

```

Here’s what’s happening:

  • The query selects `customer_id` and `phone_number` from the customers table.
  • The `COALESCE` function checks if `phone_number` is NULL.
  • If `phone_number` is NULL, it substitutes it with 'No phone available'.

This ensures that every customer record displays a meaningful value, improving clarity in reports and dashboards.

2. Handling NULLs in calculations

Nothing throw a wrench in your mathematical operations like NULL values. Using `COALESCE`, you can substitute NULLs with a default numeric value so you don't throw off your operations.

Imagine you're working with an online retail dataset where some orders have missing discount values. If a discount is NULL, you need to treat it as 0 to ensure your revenue calculations are accurate.

```sql

SELECT order_id, COALESCE(discount, 0) AS adjusted_discount

FROM orders;

```

Here’s what’s happening:

  • The query retrieves `order_id` and discount from the orders table.
  • The `COALESCE` function checks if `discount` is NULL.
  • If `discount` is NULL, it substitutes it with 0, ensuring the field always has a numerical value.

This approach prevents issues when performing calculations like total revenue or average discount applied, avoiding errors caused by NULL values.

3. Concatenating strings without NULL issues

String concatenation involving NULL values can return NULL unexpectedly. `COALESCE` ensures that missing values are replaced with an empty string.

Let's say you're creating a name tag for an event, but some attendees only provided their first or last name. Instead of leaving a blank space, you want to make sure the name tag always has something written.

```sql

SELECT COALESCE(first_name, '') || ' ' || COALESCE(last_name, '') AS full_name

FROM employees;

```

Here’s what’s happening:

  • `COALESCE(first_name, '')` ensures that if `first_name` is NULL, it’s replaced with an empty string.
  • `COALESCE(last_name, '')` does the same for `last_name`.
  • The `|| ' ' ||` ensures that there is a space between the names, preventing a NULL value from affecting the concatenation.

This bit of SQL guarantees that every employee has a valid full name in the output, even if one of the components is missing.

4. Providing default dates

For time-based queries, `COALESCE` helps assign default timestamps when values are missing.

Imagine you’re running an e-commerce platform, and you want to track order deliveries. However, some deliveries don’t have a recorded delivery date. Instead of leaving those orders with NULL values, you might want to assume they were delivered today.

```sql

SELECT order_id, COALESCE(delivery_date, CURRENT_DATE) AS delivery_date

FROM orders;

```

Here’s what’s happening in this SQL:

  • The query selects `order_id` and `delivery_date` from the orders table.
  • The `COALESCE` function checks if `delivery_date` is NULL.
  • If it is NULL, it substitutes it with `CURRENT_DATE`, which returns today’s date.

This ensures that reports always have a valid delivery date, preventing NULL values from causing confusion or errors in calculations.

COALESCE vs. IFNULL vs. NVL vs. NULLIF

While `COALESCE` is a go-to for dealing with NULLs, Snowflake provides other functions that serve similar purposes. Here’s how they compare:

Function Purpose Arguments Returns Example
COALESCE Returns the first non-NULL value from a list of expressions. Multiple expressions The first non-NULL value, or NULL if all are NULL. COALESCE(NULL, NULL, 10) (returns 10)
IFNULL Replaces NULL with a specified value. 2 expressions The second expression if the first is NULL. IFNULL(NULL, 10) (returns 10)
NVL Similar to IFNULL, replaces NULL with a specified value. 2 expressions The second expression if the first is NULL. NVL(NULL, 10) (returns 10)
NULLIF Returns NULL if two expressions are equal, otherwise returns the first expression. 2 expressions NULL if the values are equal, otherwise the first expression. NULLIF(10, 10) (returns NULL)

Key Differences

  • `COALESCE` can handle multiple fallback values, while `IFNULL` and `NVL` only allow one replacement.
  • `NULLIF` is designed for a different purpose. It’s used to return NULL when two expressions match, rather than replacing NULL values.

Tips to keep in mind when using COALESCE

To make the most out of `COALESCE`, keep these best practices in mind:

1. Data type inference can be tricky: While `COALESCE` will return the data type of the first non-NULL expression, it can lead to unexpected results when dealing with more complex data types. For example, if your expressions are a mix of integers and strings, Snowflake might cast everything to a string, even if you only wanted to keep the numeric type.

2. COALESCE with dates and timestamps: `COALESCE` doesn’t always behave predictably with date and timestamp columns. If the first non-NULL value is a string or a different type of date (like a timestamp instead of a date), Snowflake might automatically convert everything to a string, which can mess up your date logic. To avoid this, be explicit with your data types or use CAST to ensure proper conversions.

3. Beware of nulls in expressions: While `COALESCE` is great for replacing NULL values, if you use it with other functions that can return NULL (like a `CASE` statement or mathematical operations), you might still get unexpected NULLs. The non-NULL expression needs to be evaluated properly before `COALESCE` can step in.

4. COALESCE with different null-handling functions: If you’re working in an environment with multiple tools or frameworks that use `COALESCE` (like dbt or external databases), be aware that different systems might handle NULLs differently. This can lead to inconsistencies when you migrate or integrate data, especially if the null replacement logic varies.

5. COALESCE doesn't perform deep NULL checks: `COALESCE` only checks for the presence of NULL in the expressions you provide, but it won’t check for other “empty” values (e.g., empty strings or arrays). If you need to treat empty strings as NULL or handle different “empty” cases, you’ll need a more custom approach using additional logic like `CASE` or `NULLIF`.

6. Ordering of expressions matters: In `COALESCE`, the order of the expressions impacts the output, since it returns the first non-NULL value. If the order isn’t intentional, you might not get the results you're looking for. Always double-check the sequence of expressions to make sure the fallback values are in the right place.

7. Can’t use COALESCE with a non-nullable column: If you try to use `COALESCE` with a non-nullable column as one of your inputs, it can cause issues if the column still ends up being NULL at runtime (perhaps due to an error in data processing). This is because non-nullable columns are supposed to always have a value, which can contradict the logic of `COALESCE`. It’s important to handle this scenario proactively.

8. Performance hit with many expressions: When using `COALESCE` with many expressions, especially in large datasets, each additional column can slightly slow down the query as the database evaluates each one. While `COALESCE` itself is optimized, overusing it in complex queries with many alternatives can add up.

These points might not always be immediately obvious but can save time debugging or optimizing queries in Snowflake.

Clean up NULL values in Snowflake with COALESCE

`COALESCE` is more than just a `NULL` handler – it's a powerful tool for writing robust, maintainable SQL. Whether you're cleaning messy data, ensuring accurate calculations, or providing sensible defaults, `COALESCE` helps you handle the unexpected with grace.

Remember: The best data engineers plan for imperfect data. `COALESCE` is your friend in making that plan work.

Looking for more Snowflake tips? Learn how to use DATEDIFF, DATE_TRUNC, IFNULL, and DATEADD.

Table of contents

    Tags

    We’re hard at work helping you improve trust in your data in less time than ever. We promise to send a maximum of 1 update email per week.

    Your email
    Ensure trust in data

    Start monitoring your data in minutes.

    Connect your warehouse and start generating a baseline in less than 10 minutes. Start for free, no credit-card required.