3 Sneaky SQL Traps You Need to Watch Out For in Your Next Analysis
You don't have to learn the hard way
SQL can be sneaky.
Especially since there are so many SQL databases out there, and they don’t all behave the same.
Sometimes, the query runs without errors, but the output isn’t what you expect. No warnings. No red flags. Just silently incorrect data that can lead to bad decisions.
I’ve learned this the hard way, but today, you don’t have to.
Here are three SQL traps that can slip under the radar and lead to incorrect conclusions if you’re not careful.
1. Using = NULL instead of IS NULL
You write a query like this:
And you get… nothing.
No error. No warning. Just 0 rows returned.
So you assume your data has no NULLs, but unfortunately, that may not actually be the case.
The problem?
In SQL, NULL
represents an unknown value. Comparing anything to NULL
with =
always evaluates to UNKNOWN
, which means your condition never matches any rows.
What to do instead:
Use IS NULL
or IS NOT NULL
to check for nulls:
This actually returns all rows where last_login
is NULL
.
2. COUNT(*) vs. COUNT(column_name)
At first glance, these two might seem interchangeable. But they’re not.
COUNT(*)
counts all rows, including those withNULL
values.COUNT(column_name)
only counts rows where that column is NOT NULL.
If you assume they return the same number, you might end up with a false sense of how much data you actually have.
Example:
If total_users = 1,000
and users_with_email = 800
, that means 200 users have a NULL email, a crucial insight you would have missed if you only used COUNT(email)
.
3. UNION ALL mixing up misordered columns
Some SQL databases don’t check if columns in a UNION ALL
queries are in the right order, they simply assume they match based on position.
But even those who do throw an error in case of mismatch, like it’s the case for BigQuery, will still assume they match based on position if the data types align.
Example:
If email
and user_id
are the same data type (like STRING
), this query won’t throw an error, but now your user_id
column contains emails, and your email
column contains user IDs.
What to do instead?
Always explicitly define column names in a UNION ALL
to avoid misalignment.
💡 If you work with BigQuery, you can now do UNION ALL BY NAME which takes care of the order for you since it matches based on name and not position.
Final Thoughts
These SQL traps are dangerous precisely because they don’t break your queries.
Your database won’t warn you, and if you’re not careful, you could end up making bad decisions based on misleading data.
Now you know what to watch out for!
Thank you for reading! I hope you found these tips will help you avoid lots of pain and suffering 😅
See you next week!
- Andres
Before you go, please hit the like ❤️ button at the bottom of this email to help support me. It truly makes a difference!
This was such an insightful read! The breakdown of IS NULL vs. = NULL really cleared up some common misconceptions, and the distinction between COUNT(*) and COUNT(column_name) was a great refresher. Also, the reminder about using the right order with UNION ALL is something every SQL user needs to keep in mind. Thanks for sharing these sneaky traps, we definitely need to watch out for them!
Great insights!