Skip to main content

Common Table Expressions (CTEs)

A Common Table Expression (CTE) is a temporary named result set that exists only during the execution of a single SQL statement. CTEs improve query readability, allow for recursive queries, and can simplify complex queries that would otherwise require nested subqueries or derived tables.

Basic Syntax

WITH cte_name AS (
SELECT column1, column2
FROM table_name
WHERE condition
)
SELECT *
FROM cte_name
WHERE another_condition;

Explanation:

  • WITH introduces the CTE.
  • cte_name is the temporary name for the result set.
  • The CTE can then be referenced in the main SELECT query as if it were a table.

Example 1: Simple CTE

Tables:

employees

employee_idnamedepartment_idsalary
1Alice10170000
2Bob10280000
3Charlie10160000

Goal: Find employees earning more than the average salary.

WITH avg_salary AS (
SELECT AVG(salary) AS avg_sal
FROM employees
)
SELECT name, salary
FROM employees, avg_salary
WHERE salary > avg_salary.avg_sal;

Output:

namesalary
Bob80000
Alice70000

Explanation:

  • The CTE avg_salary calculates a single value.
  • The main query references it for filtering, making the query clearer than a nested subquery.

Example 2: Multiple CTEs

You can define multiple CTEs in a single query.

WITH dept_avg AS (
SELECT department_id, AVG(salary) AS avg_sal
FROM employees
GROUP BY department_id
),
high_earners AS (
SELECT e.name, e.salary, e.department_id
FROM employees e
JOIN dept_avg d
ON e.department_id = d.department_id
WHERE e.salary > d.avg_sal
)
SELECT *
FROM high_earners;

Output:

namesalarydepartment_id
Alice70000101
Bob80000102

Explanation:

  • First CTE calculates department average salary.
  • Second CTE filters employees above department average.
  • Main query selects from the second CTE.
  • Readability and modularity are greatly improved.

Example 3: Recursive CTE

Definition: Recursive CTEs reference themselves. Useful for hierarchical data, e.g., organizational charts.

Table: employees (manager hierarchy)

employee_idnamemanager_id
1AliceNULL
2Bob1
3Charlie2
4David2

Goal: Get all employees under manager Alice.

WITH RECURSIVE subordinates AS (
SELECT employee_id, name, manager_id
FROM employees
WHERE manager_id IS NULL -- Alice

UNION ALL

SELECT e.employee_id, e.name, e.manager_id
FROM employees e
INNER JOIN subordinates s
ON e.manager_id = s.employee_id
)
SELECT *
FROM subordinates;

Output:

employee_idnamemanager_id
1AliceNULL
2Bob1
3Charlie2
4David2

Explanation:

  • Recursive CTE starts with the anchor member (Alice).
  • Recursively joins to find subordinates at all levels.

Advantages of CTEs

  1. Improved Readability: Break complex queries into named steps.
  2. Reusable within Query: The CTE name can be referenced multiple times.
  3. Supports Recursion: Useful for hierarchical data.
  4. Better than nested subqueries: Simplifies complex filtering, aggregation, or joins.

When to Use Joins, Subqueries, and CTEs

OperationWhen to UseNotes
JOINCombining columns from multiple tables in a single query.Efficient for large datasets; supported in all databases.
SubqueryFiltering based on aggregated or conditional values; existence checks; single-step computations.Can be correlated (row-by-row) or nested.
CTEBreaking complex queries into readable steps; recursive hierarchies; multiple references to same intermediate result.Optimized by the query planner; makes query maintenance easier.

Database Optimization Notes

FeaturePostgreSQLMySQLSQL ServerOracle
JOIN optimizationUses hash join, merge join, nested loop; well-optimizedSimilar; uses nested loops or hash joinsUses cost-based optimizer; multiple join strategiesSimilar; cost-based optimizer
Subquery optimizationCan convert to JOIN internally for efficiencyCorrelated subqueries may be slow; sometimes optimized to JOINCorrelated subqueries optimized to JOINSimilar; optimizer can flatten subqueries
CTE optimizationNon-recursive CTE may be inlinedNon-recursive CTE inlined (MySQL 8+)Non-recursive CTE inlinedNon-recursive CTE inlined; recursive executed iteratively
Recursive CTEWell-supported; optimizedSupported from 8.0; can be slower on large dataSupported; optimizedSupported; can handle large hierarchies

Note:

  • Non-recursive CTEs are often just syntactic sugar for derived tables; performance is similar to subqueries.
  • Recursive CTEs are unique; they are optimized differently than joins or subqueries.

Key Takeaways

  1. CTEs are temporary result sets that improve query clarity and structure.
  2. Recursive CTEs are ideal for hierarchical data.
  3. Subqueries are better for filtered or aggregated computations.
  4. Joins are most efficient for combining tables.
  5. Use CTEs when queries are complex, reused multiple times, or require recursion.
  6. Databases optimize these differently: CTEs are usually inlined, subqueries may be flattened into joins, and recursive CTEs are iteratively evaluated.