Writing Recursive CTEs Without a Headache

Recursive Common Table Expressions (CTEs) are one of SQL's most powerful features, yet they often intimidate even experienced developers. If you've ever struggled with infinite loops, confusing syntax, or performance issues when writing recursive queries, this guide is for you.


What Are Recursive CTEs?

A recursive CTE is a temporary result set that references itself, allowing you to process hierarchical or graph-based data in a relational database. They're perfect for use cases like:

  • Organizational charts

  • Bill of materials (product assemblies)

  • Network or graph traversals

  • Generating series of values

The Basic Structure

Every recursive CTE has three key components:

WITH RECURSIVE cte_name AS (
    -- Base case (non-recursive term)
    SELECT initial_data

    UNION [ALL]

    -- Recursive case
    SELECT additional_data
    FROM cte_name
    WHERE termination_condition
)
SELECT * FROM cte_name;

A Simple Example: Number Generation

Let’s start with generating numbers from 1 to 10:

WITH RECURSIVE numbers AS (
    -- Base case: start with 1
    SELECT 1 AS n

    UNION ALL

    -- Recursive case: add 1 until we reach 10
    SELECT n + 1
    FROM numbers
    WHERE n < 10
)
SELECT n FROM numbers;

This demonstrates the pattern clearly:

  1. Start with initial data (1)

  2. Keep adding rows (n + 1) until the condition fails (n < 10)

Practical Example: Employee Hierarchy

Imagine an employees table with id, name, and manager_id columns. To find all subordinates under a specific manager:

WITH RECURSIVE org_chart AS (
    -- Base case: start with the manager
    SELECT id, name, manager_id, 1 AS level
    FROM employees
    WHERE id = 100  -- Starting manager ID

    UNION ALL

    -- Recursive case: get all direct reports
    SELECT e.id, e.name, e.manager_id, oc.level + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart ORDER BY level;

Avoiding Common Pitfalls

  1. Infinite Loops: Always include a proper termination condition.

    • Use a depth counter (like level in the example above)

    • Track visited nodes for graph traversals

  2. Performance Issues:

    • Use UNION instead of UNION ALL when possible to eliminate duplicates

    • Add appropriate indexes on join columns

    • Consider limiting recursion depth for very large hierarchies

  3. Missing Base Case: Your recursive CTE must start with a non-recursive query.

Advanced Technique: Path Tracking

For graph traversals, you often need to track the complete path from one node to another:

WITH RECURSIVE graph_path AS (
    SELECT 
        node_id, 
        ARRAY[node_id] AS path,
        1 AS depth
    FROM graph
    WHERE node_id = 'start_node'

    UNION ALL

    SELECT 
        g.node_id,
        gp.path || g.node_id,
        gp.depth + 1
    FROM graph g
    JOIN graph_path gp ON g.parent_id = gp.node_id
    WHERE NOT g.node_id = ANY(gp.path)  -- Prevent cycles
)
SELECT * FROM graph_path;

When Not to Use Recursive CTEs

While powerful, recursive CTEs aren't always the best solution:

  • For simple hierarchical queries, consider using window functions

  • For very deep or complex hierarchies, a graph database may be more appropriate

  • When your database engine doesn’t support recursion efficiently or imposes strict depth limits

Final Tips

  1. Test incrementally: Build your recursive CTE step by step

  2. Visualize your data: Draw the hierarchy or graph to understand relationships

  3. Limit results during development: Add LIMIT 20 to prevent runaway queries

  4. Check your database’s documentation: Recursion limits and optimization techniques vary across platforms

Recursive CTEs become much easier with practice. Start with simple examples, understand the underlying pattern, and you'll be ready to tackle complex hierarchical queries with confidence.

Comments

Popular posts from this blog

Migrating SQL Server to Azure SQL Database: A Step-by-Step Guide

Common Causes of Slow Queries in SQL Server and How to Fix Them

MS SQL Server Performance Optimization: Best Practices & Tips