
Phuong Nguyen on Writing SQL That Answers Business Questions
A practical take on Phuong Nguyen's SQL live, with examples to turn business questions into clean queries for work and interviews.
Phuong Nguyen recently shared something that caught my attention: "⏰ It's in 1 hour! At 12:30, we meet live to level up in SQL." They followed with a situation many of us recognize: you know the syntax, you have watched tutorials, you can write queries, and yet "the business questions remain fuzzy" and the result still does not answer the business need.
That gap is the real skill. And I like how Phuong framed the root cause: the problem is not SQL keywords, it is "the way you start from the business question to write your query". In other words, translating business intent into a precise data question, and only then into SQL.
Below, I want to expand on Phuong Nguyen's point with a practical workflow, common pitfalls, and a few examples you can reuse in interviews and on the job.
SQL is rarely the bottleneck
Most people learn SQL in the same order:
- SELECT, WHERE, GROUP BY
- joins
- window functions
- some performance basics
Then reality hits. Stakeholders do not ask, "Can you write a LEFT JOIN?" They ask, "How many active customers do we have?" Or, "Did the campaign increase retention?" Or, "Why did revenue drop last week?"
Phuong Nguyen described that moment perfectly: you can write queries, but you still doubt yourself in interviews or on the job because the business questions are fuzzy, the expected output is unclear, and subtle wording changes the correct SQL.
The best SQL practitioners are not the ones who know the most functions. They are the ones who can turn ambiguity into definitions.
The hidden step: translate the business question
When Phuong said they would work "like in a company" with "data close to real-world conditions", it highlighted the missing context in many tutorials: real data is messy, and business language is imprecise.
I use a simple translation layer before writing SQL. It looks like this.
Step 1: Restate the question in one sentence
Take the stakeholder request and rewrite it in your own words. If you cannot restate it clearly, you cannot query it correctly.
Example:
- Stakeholder: "How many active users do we have?"
- Your restatement: "Count distinct users who performed at least one qualifying action in the last 28 days."
Notice what changed: we added an action, a time window, and a counting rule.
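The restated version maps directly to SQL. Here is a minimal sketch of that query, run against SQLite through Python's sqlite3 module so you can try it anywhere; the `events` table, its columns, and the list of "qualifying" actions are all hypothetical:

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical events table: one row per user action.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event_date TEXT, event_type TEXT)")
today = date(2024, 6, 1)
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    (1, "2024-05-20", "login"),      # inside the 28-day window
    (1, "2024-05-25", "purchase"),   # same user again: must not double count
    (2, "2024-04-01", "login"),      # outside the window
    (3, "2024-05-30", "page_view"),  # non-qualifying action
])

cutoff = (today - timedelta(days=28)).isoformat()
active_users = conn.execute(
    """
    SELECT COUNT(DISTINCT user_id)
    FROM events
    WHERE event_date >= ?
      AND event_type IN ('login', 'purchase')  -- the agreed "qualifying" actions
    """,
    (cutoff,),
).fetchone()[0]
print(active_users)  # 1: only user 1 qualifies
```

Every piece of the restatement shows up in the SQL: DISTINCT is the counting rule, the cutoff is the window, and the IN list is the qualifying action.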
Step 2: Define the metric and the grain
Two definitions matter more than anything:
- Metric definition: what exactly are you counting or summing?
- Grain: what is one row in the result?
Common grains: per day, per week, per user, per account, per order.
If the grain is wrong, everything downstream is wrong: joins duplicate rows, sums inflate, and conversion rates lie.
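A tiny sketch makes the grain problem concrete. Assuming hypothetical `orders` and `order_items` tables (SQLite via Python's sqlite3), joining to items moves the query to the item grain, so the order-level total is summed once per item:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, order_total REAL);
CREATE TABLE order_items (order_id INTEGER, item_price REAL);
INSERT INTO orders VALUES (1, 100.0);
INSERT INTO order_items VALUES (1, 60.0), (1, 40.0);  -- two items, one order
""")

# Wrong grain: the join puts one row PER ITEM in play,
# so the order-level total is counted once per item.
inflated = conn.execute("""
    SELECT SUM(o.order_total)
    FROM orders o JOIN order_items i ON o.order_id = i.order_id
""").fetchone()[0]

# Right grain: one row per order, summed once.
correct = conn.execute("SELECT SUM(order_total) FROM orders").fetchone()[0]
print(inflated, correct)  # 200.0 100.0
```

The join did not fail; it silently doubled revenue. That is what "the grain is wrong, so sums inflate" looks like in practice.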
Step 3: Identify the entities and tables
List the entities implied by the question: users, sessions, orders, invoices, tickets, products.
Then map them to tables and keys:
- user_id
- order_id
- account_id
- event_id
This is where a lot of "I know SQL but I'm stuck" happens. Not because of SQL, but because you do not yet know your data model well enough. In a live session like the one Phuong announced, practicing this mapping on realistic datasets is exactly what builds speed.
Step 4: Decide filters, edge cases, and exclusions
Phuong mentioned "the subtleties in the questions, the ones that completely change the final query". This is that part.
Ask explicitly:
- Time window: last 7 days, last full week, month to date, rolling 30?
- Timezone: UTC, local time, account timezone?
- Status filters: paid orders only, exclude refunds, exclude test accounts?
- Deduplication: multiple events per user, multiple invoices per order?
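The deduplication bullet is the one that most often corrupts totals silently. One common fix is "keep the latest row per entity" with a window function; here is a sketch (SQLite via Python's sqlite3, hypothetical `invoices` table) where an order was re-invoiced:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (order_id INTEGER, invoice_id INTEGER, amount REAL, created_at TEXT);
-- order 1 was re-invoiced: only the latest invoice should count
INSERT INTO invoices VALUES
    (1, 10, 90.0, '2024-05-01'),
    (1, 11, 95.0, '2024-05-02'),
    (2, 12, 50.0, '2024-05-03');
""")

# Keep exactly one invoice per order (the latest) before summing.
total = conn.execute("""
    WITH ranked AS (
        SELECT amount,
               ROW_NUMBER() OVER (
                   PARTITION BY order_id ORDER BY created_at DESC
               ) AS rn
        FROM invoices
    )
    SELECT SUM(amount) FROM ranked WHERE rn = 1
""").fetchone()[0]
print(total)  # 145.0: 95.0 + 50.0
```

Without the dedup step the naive SUM would return 235.0 and nobody would notice until the number disagreed with finance.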
Step 5: Choose the output that answers the business need
A stakeholder might need:
- a single KPI number
- a trend line over time
- a breakdown by segment
- a list of entities to action
Different outputs imply different SQL patterns. A KPI might be one aggregated row. A trend requires a date spine. A list requires limiting and sorting.
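The date spine is the least obvious of those patterns, so here is a sketch (SQLite via Python's sqlite3, hypothetical `orders` table): a recursive CTE generates every date in the range, so days with zero revenue appear in the trend instead of silently disappearing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_date TEXT, amount REAL);
INSERT INTO orders VALUES ('2024-06-01', 10.0), ('2024-06-03', 20.0);
-- note: no orders on 2024-06-02
""")

# The spine CTE produces one row per calendar day; the LEFT JOIN
# keeps days that have no orders, shown as zero revenue.
rows = conn.execute("""
    WITH RECURSIVE spine(d) AS (
        SELECT '2024-06-01'
        UNION ALL
        SELECT DATE(d, '+1 day') FROM spine WHERE d < '2024-06-03'
    )
    SELECT spine.d, COALESCE(SUM(o.amount), 0) AS revenue
    FROM spine
    LEFT JOIN orders o ON o.order_date = spine.d
    GROUP BY spine.d
    ORDER BY spine.d
""").fetchall()
print(rows)  # the middle day appears with 0 revenue
```

Grouping the orders table alone would return two rows and the trend line would quietly skip June 2.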
A concrete example: one subtlety, totally different SQL
Imagine an e-commerce dataset with orders and order_items.
Business question: "What is revenue last month?"
Clarifying questions that change the query:
- Do we mean gross revenue or net revenue?
- Do we include taxes and shipping?
- Do we exclude canceled orders or include them until refunded?
- Do we define "last month" as the previous calendar month or the last 30 days?
Here is a simplified pattern that avoids common mistakes (double counting because of joins, and unclear filters):
SELECT
  DATE_TRUNC('month', order_date) AS month,
  SUM(order_total_amount) AS revenue
FROM orders
WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month'
  AND order_date < DATE_TRUNC('month', CURRENT_DATE)
  AND order_status = 'paid'
GROUP BY 1;
If you join to order_items and sum item prices without care, you can easily double count shipping fees, discounts, or taxes. If you filter on item status instead of order status, you might exclude legitimate revenue. These are not "SQL syntax" issues. They are definition issues.
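When you do need item-level detail, one way to sidestep the double count mechanically is to collapse order_items to the order grain before joining. A sketch (SQLite via Python's sqlite3, hypothetical tables and a `shipping` column as the order-level fee):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, order_status TEXT, shipping REAL);
CREATE TABLE order_items (order_id INTEGER, item_price REAL);
INSERT INTO orders VALUES (1, 'paid', 5.0);
INSERT INTO order_items VALUES (1, 60.0), (1, 40.0);
""")

# Safe pattern: aggregate items to the order grain FIRST, then join;
# the order-level shipping fee is counted exactly once per order.
revenue = conn.execute("""
    WITH items_per_order AS (
        SELECT order_id, SUM(item_price) AS items_total
        FROM order_items
        GROUP BY order_id
    )
    SELECT SUM(i.items_total + o.shipping)
    FROM orders o
    JOIN items_per_order i ON i.order_id = o.order_id
    WHERE o.order_status = 'paid'
""").fetchone()[0]
print(revenue)  # 105.0
```

Joining the raw item rows instead would add shipping once per item (110.0), which is exactly the double-count trap described above.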
When someone says "the query is wrong", it is often because the definitions were never agreed on.
The workplace workflow Phuong hints at
Phuong Nguyen emphasized practicing "step by step" with "concrete reflexes" to write cleaner, faster queries. In a work setting, those reflexes are less about typing speed and more about reducing rework.
Here is the workflow I recommend (and what I suspect Phuong teaches in that live format):
1) Sketch the logic before writing SQL
Write a short plan:
- filter orders to paid, within the time window
- aggregate at the right grain
- validate against a known number or a small sample
This prevents the common trap: writing a complex query first and debugging definitions later.
2) Build in layers (CTEs are your friend)
Layered queries make intent visible:
- base_orders with clean filters
- daily_revenue aggregated
- final select for reporting format
This also makes interviews easier: you can explain each step and show your reasoning.
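The layered structure looks like this in practice. A sketch (SQLite via Python's sqlite3, hypothetical `orders` table); each CTE is one step of the plan and can be SELECTed on its own while debugging:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, order_status TEXT, order_date TEXT, order_total REAL);
INSERT INTO orders VALUES
    (1, 'paid',     '2024-06-01', 100.0),
    (2, 'canceled', '2024-06-01',  30.0),
    (3, 'paid',     '2024-06-02',  50.0);
""")

rows = conn.execute("""
    WITH base_orders AS (            -- step 1: clean filters
        SELECT order_date, order_total
        FROM orders
        WHERE order_status = 'paid'
    ),
    daily_revenue AS (               -- step 2: aggregate at the daily grain
        SELECT order_date, SUM(order_total) AS revenue
        FROM base_orders
        GROUP BY order_date
    )
    SELECT order_date, revenue       -- step 3: reporting format
    FROM daily_revenue
    ORDER BY order_date
""").fetchall()
print(rows)  # [('2024-06-01', 100.0), ('2024-06-02', 50.0)]
```

Notice that the canceled order disappears in step 1, so every later layer can trust its input.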
3) Validate with quick checks
Before you ship an analysis:
- sanity check row counts
- check for duplicates after joins
- compare totals to a dashboard or last week's output
- spot check a few user_ids or order_ids
Validation is what turns a "working query" into a trustworthy answer.
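Two of those checks are themselves one-line queries. A sketch (SQLite via Python's sqlite3, hypothetical `report` table standing in for your final result) where a join accidentally duplicated an order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE report (order_id INTEGER, revenue REAL);
INSERT INTO report VALUES (1, 100.0), (2, 50.0), (2, 50.0);  -- oops: order 2 twice
""")

# Check 1: duplicates after joins (an empty result means the grain is clean).
dupes = conn.execute("""
    SELECT order_id, COUNT(*) AS n
    FROM report
    GROUP BY order_id
    HAVING COUNT(*) > 1
""").fetchall()

# Check 2: total against a known reference number (e.g. yesterday's dashboard).
total = conn.execute("SELECT SUM(revenue) FROM report").fetchone()[0]
print(dupes, total)  # [(2, 2)] 200.0 -> both checks flag a problem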
Why this matters for interviews and confidence
Phuong wrote that even people who can write SQL still doubt themselves "in interviews or on the job". I have seen this repeatedly: interview questions are often disguised translation tasks.
Example prompt: "Find our most valuable customers." That is not a SQL question. It is:
- define "valuable" (LTV? revenue in last 90 days? margin?)
- choose a time window
- decide whether value is per user or per account
- handle refunds and cancellations
The candidate who asks 2-3 smart clarifying questions and then writes a simple, correct query will outperform the candidate who jumps into window functions immediately.
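Once those choices are pinned down, the query can stay simple. A sketch (SQLite via Python's sqlite3, hypothetical `orders` table) assuming "valuable" means net revenue over a 90-day window ending June 2024, with refunds stored as negative amounts:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (user_id INTEGER, order_date TEXT, amount REAL, order_status TEXT);
INSERT INTO orders VALUES
    (1, '2024-05-01', 200.0, 'paid'),
    (1, '2024-05-10', -50.0, 'refunded'),  -- refunds reduce value
    (2, '2024-05-15', 120.0, 'paid'),
    (3, '2023-01-01', 999.0, 'paid');      -- outside the 90-day window
""")

# Assumed definition: net revenue per user since the window start,
# most valuable first, capped at a reviewable list size.
top = conn.execute("""
    SELECT user_id, SUM(amount) AS net_revenue
    FROM orders
    WHERE order_date >= '2024-03-01'
    GROUP BY user_id
    ORDER BY net_revenue DESC
    LIMIT 10
""").fetchall()
print(top)  # [(1, 150.0), (2, 120.0)]
```

No window functions needed; the hard part was everything above the SELECT.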
If you want to improve fast, practice the translation, not the syntax
Phuong Nguyen's invitation included two details I think are underrated: "No software to install" and "You practice live". Removing setup friction and practicing live forces you to focus on reasoning, not tooling.
If you are practicing on your own, recreate that environment:
- pick a realistic dataset (orders, subscriptions, support tickets, events)
- write 10 business questions yourself
- for each, write the definition first, then the SQL
- compare two versions when wording changes
That repetition builds the reflex Phuong described: the ability to spot subtleties and adjust the query confidently.
Closing thought
What I took from Phuong Nguyen's post is simple: SQL skill is not just knowing SQL. It is knowing how to start from a business question and end with a useful, defensible answer. Once you train that translation step, your queries get cleaner, your results get trusted, and interviews feel a lot less intimidating.
This blog post expands on a viral LinkedIn post by Phuong Nguyen, Programme Accélérateur Projet Portfolio | 2.7K newsletter readers leveling up in data | Freelance Data Analyst. View the original LinkedIn post.