📣 Today’s post is brought to you in collaboration with a special guest: my friend Tamás Ujhelyi. He’s a Data Scientist/Analytics Engineer based in Hungary, and I always enjoy his articles for their unique storytelling. I highly recommend you check out his work.
Want to bomb your next data science interview or guarantee your coworkers roll their eyes the next time they read your queries?
Then, by all means, don’t bother formatting your SQL.
Who cares about indenting SQL properly—it’s not like you’re writing Python.
And isn’t good coding practice just for software engineers anyway?
Well, let me tell you—overlooking proper SQL formatting is a huge mistake.
A hiring manager can tell a lot about your experience level just by looking at a few lines of your code.
Unfortunately, most people don’t learn this before “the job”.
It’s one of those ironies in data science: SQL is essential on the job, but it’s hard to truly practice before you get there.
Python usually takes center stage.
Well, today, we want to share some best practices to ensure you write production-ready SQL that will impress others.
I’ll let Tamás break it down for you.
Write CTEs, and you’ll be loved
Nothing says “I know what I'm doing” quite like using CTEs.
If you’ve ever worked with complex queries in a production environment, you know why CTEs are huge.
They:
Improve code maintainability with their easy-to-read structure.
Make debugging easier.
Foster reusability (you can build new queries from other queries’ CTEs).
Just look at this query—you instantly get what’s happening:
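Here is a sketch of what such a query could look like. The `users` and `orders` tables and all their columns are hypothetical, and the date literal syntax may differ slightly in your database:

```sql
-- Hypothetical schema (assumed for illustration only):
--   users(user_id, signup_date)
--   orders(order_id, user_id, amount, status)
with paying_users as (
    -- total completed spend per user
    select
        user_id
        , sum(amount) as total_spent
    from orders
    where status = 'completed'
    group by user_id
)

, recent_signups as (
    -- users who signed up in 2024 or later
    select
        user_id
        , signup_date
    from users
    where signup_date >= date '2024-01-01'
)

select
    recent_signups.user_id
    , recent_signups.signup_date
    , paying_users.total_spent
from recent_signups
inner join paying_users
    on recent_signups.user_id = paying_users.user_id
order by paying_users.total_spent desc;
```

Each CTE does one thing and is named after what it returns, so the final `select` reads almost like a sentence.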
If you can build up complex queries from CTEs with easy-to-follow logic and proper naming, you’ve almost won the battle.
Speaking of which, let's talk about naming conventions.
Call them what they are
Your naming choices for tables, CTEs, and column aliases matter a lot.
A LOT.[1]
Not convinced?
Have fun working with queries like this:
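Something like this, for instance. It is a made-up query, written badly on purpose; every name in it is hypothetical:

```sql
-- Deliberately unhelpful names (hypothetical example):
-- what is t? what does prep_data prepare? where does x come from?
with prep_data as (
    select id, c1 from t where c2 = 1
)

select a.id, b.c3 x
from prep_data a
join t2 b on a.id = b.id;
```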
Remember, your aim is to write production-ready queries, not to play guessing games with your teammates.
`t` is a terrible name for a table; `orders`, on the other hand, immediately conveys what your table contains.
The same goes for your CTEs. I don’t know what `prep_data` means, and honestly, I don’t even want to waste time figuring it out. But if you call it `fake_user_ids`, then I’ll think: that’s what’s up!
Also, always write out the full name of each join. I know a plain `join` means `inner join` in most databases, but make it explicit. Don’t leave room for assumptions that could lead to misunderstandings.
And misunderstandings lead to bugs.
Bugs lead to debugging.
And debugging eats up time that could’ve been better spent on something else, like providing value to the business.
One final piece of advice:
When you rename a column, type `as`. Skipping it may make you feel cool and clever because you saved half a second of typing, but without it, it’s easy to miss where a column was renamed.
That’s frustrating. And frustrating is something you don’t want.
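Put together, the advice in this section might look like the sketch below. The `users` and `orders` tables, their columns, and the idea that test accounts share an e-mail domain are all assumptions made for illustration:

```sql
-- Hypothetical schema: users(user_id, email), orders(order_id, user_id, amount)
with fake_user_ids as (
    -- assumption: internal test accounts use the example.com domain
    select user_id
    from users
    where email like '%@example.com'
)

select
    orders.order_id
    , orders.amount as order_amount    -- explicit "as" makes the rename easy to spot
from orders
left join fake_user_ids                -- join type spelled out
    on orders.user_id = fake_user_ids.user_id
where fake_user_ids.user_id is null;   -- keep only orders placed by real users
```

The table name, the CTE name, the join type, and the alias each say what they are, so nobody has to guess.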
One column, one line
Put each column in your `select` on its own line. This makes your query more readable:
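For instance, with a hypothetical `orders` table:

```sql
-- One column per line (table and column names are hypothetical)
select
    order_id,
    user_id,
    order_date,
    amount
from orders;
```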
Just imagine how hard it’d be to read the columns if you had ten or more columns all in one line. You’d literally have to scroll horizontally to check all selected columns. Not cozy at all:
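Here is what that looks like with ten hypothetical columns crammed together:

```sql
-- Everything on one line (column names are hypothetical)
select order_id, user_id, order_date, amount, currency, status, shipping_country, payment_method, coupon_code, created_at from orders;
```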
Bonus points (from me, at least 🙂) if you use leading commas instead of trailing ones. After years of working with SQL, I’ve found it’s much easier to catch a missing comma when the commas lead.
Trailing commas:
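A sketch with hypothetical columns. The commas sit at the ragged right edge of each line, so a missing one is easy to overlook:

```sql
-- Trailing commas (table and column names are hypothetical)
select
    order_id,
    user_id,
    order_date,
    amount
from orders;
```

If the comma after `user_id` goes missing, many databases will not even raise an error: they read `user_id order_date` as a column alias and quietly return one column fewer.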
Leading commas:
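A sketch with hypothetical columns. The commas line up at the start of each line, so a gap stands out right away:

```sql
-- Leading commas (table and column names are hypothetical)
select
    order_id
    , user_id
    , order_date
    , amount
from orders;
```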
See what I mean? 😉
Stick to your team’s style
If your SQL style is consistent, you’ve probably gotten used to, or already use, a linter like SQLFluff.
This is good.
In a production environment, your team will most likely use a linter, because it forces everyone to adopt the team’s SQL styling conventions.
How?
When the linter runs as a pre-commit hook or a CI check, you can’t push your query to production until it adheres to the linter’s rules.
For example, in this case, the columns should be on separate lines:
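As an illustration (this is not actual linter output), a query like the first one below would typically be flagged by a layout rule. In SQLFluff, the relevant rule should be `layout.select_targets` (LT09), though rule names can change between versions. The table and columns are hypothetical:

```sql
-- Flagged by a "one select target per line" rule: several targets share a line
select order_id, user_id, amount
from orders;

-- Accepted by that rule: one select target per line
select
    order_id,
    user_id,
    amount
from orders;
```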
If everyone on your team is forced to use the same SQL styling habits, then it’ll be much easier to work on each other’s queries.
Just compare this query (from a previous example):
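For instance, a deliberately messy query with made-up names:

```sql
-- Hypothetical example: cryptic names, implicit join type, no "as", everything squeezed together
with prep_data as (select id, c1 from t where c2 = 1)
select a.id, b.c3 x from prep_data a join t2 b on a.id = b.id;
```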
To this query:
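A sketch of a query written to the conventions from this post. The names (`active_users`, `users`, `orders`, and their columns) are hypothetical; the point is only the style:

```sql
-- Hypothetical schema: users(user_id, status), orders(order_id, user_id, amount)
with active_users as (
    select user_id
    from users
    where status = 'active'
)

select
    active_users.user_id
    , orders.amount as order_amount
from active_users
inner join orders
    on active_users.user_id = orders.user_id;
```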
Based on these queries, which person would you rather work with?
Explain the “why”, not the “what” with your comments
Take a look at this little query:
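Something along these lines (a made-up query on a hypothetical `orders` table):

```sql
-- select all orders from 2024
select
    order_id
    , amount
from orders
where order_date >= date '2024-01-01'
    and order_date < date '2025-01-01';
```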
Let's be honest—is this comment helpful?
No, it is not.
The query itself says what’s happening; the comment’s just unnecessary noise. Remove it.
Let me show you an example where a comment is actually useful:
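The scenario below is invented for illustration (the `orders` table, the migration date, and the finance request are all assumptions), but it shows the kind of comment I mean:

```sql
-- Hypothetical scenario, for illustration only
select
    order_id
    , amount
from orders
-- Orders before 2023-03-01 were migrated from the legacy system with
-- unreliable amounts, so finance asked us to exclude them from reporting.
where order_date >= date '2023-03-01';
```

Nobody could work out that reasoning from the filter alone, which is exactly why it belongs in a comment.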
When you comment on your queries, always state the reasoning and decisions behind your choices. Explain the “why” and not the “what” to help your teammates understand why your query is the way it is.
This way, you'll save your future self and your teammates from unnecessary headaches and deep dives into company history just to understand why the query was written that way.
Trust me: if you focus on explaining the “why” in your comments, your team will love you for it.
That’s not a bad thing, is it? 😉
Master these skills before your next interview
We hope that by now you understand the importance of writing proper SQL.
And if you’re prepping for data science interviews, keep in mind that SQL is almost always part of the technical assessment.
Many candidates spend all their time grinding Python and machine learning, only to get blindsided when asked to write a complex SQL query. But in reality, hiring managers expect you to be just as comfortable with SQL—because in most data roles, you’ll use it every day.
And here’s the thing: it’s not just about getting the right answer. Interviewers pay attention to how you structure your queries—whether your logic is clear, readable, and follows best practices. Writing clean SQL isn’t just a skill; it’s a signal that you know what you’re doing.
So don’t neglect it. Treat SQL as a core part of your prep, and you’ll be ready to impress.
Thank you for reading! I hope you enjoyed this collaboration and found these tips valuable.
See you next week!
- Andres
[1] I’m not angry with you. I just had to emphasize the importance. 🤷
Thanks Andres! As a SQL Developer, I have implemented all of these techniques. At a previous company we defined SQL standards for our team of 3 SQL developers. There was some grumbling at first, but we quickly came to love it! It made reading each others' code so much easier, and leading commas were a game-changer! Also, leading with ANDs and ORs in the WHERE clause. Another standard we used was to put SQL keywords in upper case.