# Advanced SQL Interview Questions: Senior Developer Level Challenges
## Introduction
Senior-level SQL interviews test deep understanding of database concepts, complex query optimization, and advanced SQL features. This guide covers challenging scenarios that separate senior developers from intermediate practitioners.
## Recursive Queries and CTEs
### 1. Hierarchical Data Traversal
**Question 1: Build an organization chart with employee hierarchy**
```sql
WITH RECURSIVE employee_hierarchy AS (
-- Anchor: Find top-level managers (no manager)
SELECT
employee_id,
employee_name,
manager_id,
job_title,
0 as level,
CAST(employee_name AS VARCHAR(1000)) as hierarchy_path,
CAST(employee_id AS VARCHAR(1000)) as id_path
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: Find direct reports
SELECT
e.employee_id,
e.employee_name,
e.manager_id,
e.job_title,
eh.level + 1,
CAST(eh.hierarchy_path || ' -> ' || e.employee_name AS VARCHAR(1000)),
CAST(eh.id_path || ',' || e.employee_id AS VARCHAR(1000))
FROM employees e
INNER JOIN employee_hierarchy eh ON e.manager_id = eh.employee_id
WHERE eh.level < 10 -- Prevent infinite recursion
)
SELECT
employee_id,
REPEAT(' ', level) || employee_name as indented_name,
job_title,
level,
hierarchy_path
FROM employee_hierarchy
ORDER BY id_path;
```
**Question 2: Calculate total team size for each manager**
```sql
WITH RECURSIVE team_structure AS (
-- Base case: All employees
SELECT
employee_id,
manager_id,
employee_name,
1 as team_member_count
FROM employees
UNION ALL
-- Recursive case: Roll up team counts
SELECT
ts.manager_id as employee_id,
e.manager_id,
e.employee_name,
ts.team_member_count
FROM team_structure ts
INNER JOIN employees e ON ts.manager_id = e.employee_id
WHERE ts.manager_id IS NOT NULL
)
SELECT
e.employee_id,
e.employee_name,
e.job_title,
COALESCE(SUM(ts.team_member_count), 0) as total_team_size
FROM employees e
LEFT JOIN team_structure ts ON e.employee_id = ts.employee_id
GROUP BY e.employee_id, e.employee_name, e.job_title
ORDER BY total_team_size DESC;
```
### 2. Graph Traversal Problems
**Question 3: Find all possible paths between two nodes**
```sql
WITH RECURSIVE path_finder AS (
-- Starting point
SELECT
source_node,
target_node,
1 as path_length,
CAST(source_node || '->' || target_node AS VARCHAR(1000)) as path,
CAST(',' || source_node || ',' || target_node || ',' AS VARCHAR(1000)) as visited_nodes
FROM graph_edges
WHERE source_node = 'A' -- Starting node
UNION ALL
-- Extend paths
SELECT
pf.source_node,
ge.target_node,
pf.path_length + 1,
CAST(pf.path || '->' || ge.target_node AS VARCHAR(1000)),
CAST(pf.visited_nodes || ge.target_node || ',' AS VARCHAR(1000))
FROM path_finder pf
INNER JOIN graph_edges ge ON pf.target_node = ge.source_node
WHERE pf.path_length < 10 -- Prevent infinite loops
AND pf.visited_nodes NOT LIKE '%,' || ge.target_node || ',%' -- Avoid cycles
)
SELECT
path,
path_length
FROM path_finder
WHERE target_node = 'Z' -- Destination node
ORDER BY path_length, path;
```
## Advanced Window Functions
### 1. Complex Analytical Functions
**Question 4: Calculate percentile ranks and statistical distributions**
```sql
SELECT
employee_id,
employee_name,
department,
salary,
-- Ranking functions
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as row_num,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) as rank_val,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dense_rank_val,
-- Percentile functions
PERCENT_RANK() OVER (PARTITION BY department ORDER BY salary) as percent_rank,
CUME_DIST() OVER (PARTITION BY department ORDER BY salary) as cumulative_dist,
NTILE(4) OVER (PARTITION BY department ORDER BY salary) as quartile,
-- Statistical functions
AVG(salary) OVER (PARTITION BY department) as dept_avg_salary,
STDDEV(salary) OVER (PARTITION BY department) as dept_salary_stddev,
-- Lag/Lead with complex conditions
LAG(salary, 1) OVER (PARTITION BY department ORDER BY salary DESC) as next_lower_salary,
LEAD(salary, 1) OVER (PARTITION BY department ORDER BY salary DESC) as next_higher_salary,
-- First/Last value functions
FIRST_VALUE(salary) OVER (
PARTITION BY department
ORDER BY salary DESC
ROWS UNBOUNDED PRECEDING
) as highest_dept_salary,
LAST_VALUE(salary) OVER (
PARTITION BY department
ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as lowest_dept_salary
FROM employees
ORDER BY department, salary DESC;
```
**Question 5: Moving calculations with complex window frames**
```sql
SELECT
order_date,
daily_revenue,
-- Moving averages with different windows
AVG(daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) as moving_avg_7_days,
AVG(daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
) as moving_avg_30_days,
-- Exponential moving average (simplified)
SUM(daily_revenue * POWER(0.9, ROW_NUMBER() OVER (ORDER BY order_date DESC) - 1)) OVER (
ORDER BY order_date
ROWS BETWEEN 9 PRECEDING AND CURRENT ROW
) / SUM(POWER(0.9, ROW_NUMBER() OVER (ORDER BY order_date DESC) - 1)) OVER (
ORDER BY order_date
ROWS BETWEEN 9 PRECEDING AND CURRENT ROW
) as exponential_moving_avg,
-- Running variance calculation
VAR_POP(daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
) as rolling_variance_30_days,
-- Median calculation using percentile
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) as rolling_median_7_days
FROM (
SELECT
order_date,
SUM(total_amount) as daily_revenue
FROM orders
GROUP BY order_date
) daily_sales
ORDER BY order_date;
```
## Complex JOIN Scenarios
### 1. Multi-table Complex Joins
**Question 6: Find customers with specific purchasing patterns**
```sql
-- Find customers who bought products from at least 3 different categories
-- in the last 6 months, with total spending > $1000
WITH customer_category_purchases AS (
SELECT
c.customer_id,
c.customer_name,
p.category,
COUNT(DISTINCT oi.product_id) as products_in_category,
SUM(oi.quantity * oi.unit_price) as category_spending
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY c.customer_id, c.customer_name, p.category
),
customer_summary AS (
SELECT
customer_id,
customer_name,
COUNT(DISTINCT category) as categories_purchased,
SUM(category_spending) as total_spending,
AVG(category_spending) as avg_category_spending,
STRING_AGG(category, ', ' ORDER BY category_spending DESC) as top_categories
FROM customer_category_purchases
GROUP BY customer_id, customer_name
)
SELECT
cs.*,
ccp.category,
ccp.products_in_category,
ccp.category_spending,
ROUND(ccp.category_spending * 100.0 / cs.total_spending, 2) as category_spend_percentage
FROM customer_summary cs
INNER JOIN customer_category_purchases ccp ON cs.customer_id = ccp.customer_id
WHERE cs.categories_purchased >= 3
AND cs.total_spending > 1000
ORDER BY cs.total_spending DESC, cs.customer_id, ccp.category_spending DESC;
```
### 2. Self-Joins with Complex Logic
**Question 7: Find overlapping date ranges**
```sql
-- Find all overlapping project assignments for employees
SELECT
p1.employee_id,
e.employee_name,
p1.project_id as project_1,
p1.start_date as project_1_start,
p1.end_date as project_1_end,
p2.project_id as project_2,
p2.start_date as project_2_start,
p2.end_date as project_2_end,
-- Calculate overlap period
GREATEST(p1.start_date, p2.start_date) as overlap_start,
LEAST(p1.end_date, p2.end_date) as overlap_end,
-- Calculate overlap duration in days
EXTRACT(DAYS FROM (
LEAST(p1.end_date, p2.end_date) - GREATEST(p1.start_date, p2.start_date)
)) + 1 as overlap_days
FROM project_assignments p1
INNER JOIN project_assignments p2 ON p1.employee_id = p2.employee_id
INNER JOIN employees e ON p1.employee_id = e.employee_id
WHERE p1.project_id < p2.project_id -- Avoid duplicate pairs
AND p1.start_date <= p2.end_date -- Check for overlap
AND p2.start_date <= p1.end_date -- Check for overlap
ORDER BY p1.employee_id, overlap_days DESC;
```
## Advanced Subqueries and Set Operations
### 1. Correlated Subqueries with EXISTS
**Question 8: Find customers with above-average purchase frequency**
```sql
SELECT
c.customer_id,
c.customer_name,
c.registration_date,
(
SELECT COUNT(*)
FROM orders o
WHERE o.customer_id = c.customer_id
) as total_orders,
(
SELECT AVG(order_count)
FROM (
SELECT COUNT(*) as order_count
FROM orders
GROUP BY customer_id
) avg_calc
) as avg_orders_per_customer
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
GROUP BY o.customer_id
HAVING COUNT(*) > (
SELECT AVG(order_count)
FROM (
SELECT COUNT(*) as order_count
FROM orders
GROUP BY customer_id
) avg_calc
)
)
AND EXISTS (
-- Also ensure they've made a purchase in the last 90 days
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
AND o.order_date >= CURRENT_DATE - INTERVAL '90 days'
)
ORDER BY total_orders DESC;
```
### 2. Complex Set Operations
**Question 9: Advanced UNION operations with data transformation**
```sql
-- Create a unified customer activity timeline
SELECT
customer_id,
activity_date,
activity_type,
activity_description,
monetary_value,
ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY activity_date DESC
) as activity_sequence
FROM (
-- Order activities
SELECT
customer_id,
order_date as activity_date,
'ORDER' as activity_type,
'Order #' || order_id || ' - $' || total_amount as activity_description,
total_amount as monetary_value
FROM orders
UNION ALL
-- Support ticket activities
SELECT
customer_id,
created_date as activity_date,
'SUPPORT' as activity_type,
'Ticket #' || ticket_id || ' - ' || issue_type as activity_description,
0 as monetary_value
FROM support_tickets
UNION ALL
-- Marketing campaign interactions
SELECT
customer_id,
interaction_date as activity_date,
'MARKETING' as activity_type,
campaign_name || ' - ' || interaction_type as activity_description,
CASE
WHEN interaction_type = 'PURCHASE' THEN purchase_amount
ELSE 0
END as monetary_value
FROM marketing_interactions
UNION ALL
-- Account changes
SELECT
customer_id,
change_date as activity_date,
'ACCOUNT' as activity_type,
'Profile updated: ' || change_description as activity_description,
0 as monetary_value
FROM account_changes
) unified_activities
WHERE activity_date >= CURRENT_DATE - INTERVAL '1 year'
ORDER BY customer_id, activity_date DESC;
```
## Performance Optimization Challenges
### 1. Query Optimization Scenarios
**Question 10: Optimize a slow reporting query**
```sql
-- Original slow query (example of what NOT to do)
/*
SELECT
c.customer_name,
(SELECT COUNT(*) FROM orders WHERE customer_id = c.customer_id) as order_count,
(SELECT SUM(total_amount) FROM orders WHERE customer_id = c.customer_id) as total_spent,
(SELECT MAX(order_date) FROM orders WHERE customer_id = c.customer_id) as last_order_date
FROM customers c
WHERE c.registration_date >= '2023-01-01'
ORDER BY total_spent DESC;
*/
-- Optimized version
SELECT
c.customer_name,
COALESCE(o.order_count, 0) as order_count,
COALESCE(o.total_spent, 0) as total_spent,
o.last_order_date
FROM customers c
LEFT JOIN (
SELECT
customer_id,
COUNT(*) as order_count,
SUM(total_amount) as total_spent,
MAX(order_date) as last_order_date
FROM orders
GROUP BY customer_id
) o ON c.customer_id = o.customer_id
WHERE c.registration_date >= '2023-01-01'
ORDER BY COALESCE(o.total_spent, 0) DESC;
-- Recommended indexes:
-- CREATE INDEX idx_customers_registration_date ON customers(registration_date);
-- CREATE INDEX idx_orders_customer_id ON orders(customer_id);
-- CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
```
## Advanced Data Manipulation
### 1. Complex UPDATE and DELETE Operations
**Question 11: Update records based on complex conditions**
```sql
-- Update customer tier based on purchase history and recency
WITH customer_metrics AS (
SELECT
c.customer_id,
COUNT(o.order_id) as total_orders,
SUM(o.total_amount) as total_spent,
MAX(o.order_date) as last_order_date,
EXTRACT(DAYS FROM (CURRENT_DATE - MAX(o.order_date))) as days_since_last_order
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id
),
tier_calculation AS (
SELECT
customer_id,
CASE
WHEN total_spent >= 10000 AND days_since_last_order <= 30 THEN 'PLATINUM'
WHEN total_spent >= 5000 AND days_since_last_order <= 60 THEN 'GOLD'
WHEN total_spent >= 1000 AND days_since_last_order <= 90 THEN 'SILVER'
WHEN total_orders > 0 AND days_since_last_order <= 180 THEN 'BRONZE'
ELSE 'INACTIVE'
END as new_tier
FROM customer_metrics
)
UPDATE customers
SET
customer_tier = tc.new_tier,
tier_updated_date = CURRENT_DATE
FROM tier_calculation tc
WHERE customers.customer_id = tc.customer_id
AND customers.customer_tier != tc.new_tier;
```
### 2. Complex Data Pivoting
**Question 12: Dynamic pivot with unknown number of columns**
```sql
-- Create a sales report pivoted by month (dynamic columns)
WITH monthly_sales AS (
SELECT
p.category,
p.product_name,
TO_CHAR(o.order_date, 'YYYY-MM') as order_month,
SUM(oi.quantity * oi.unit_price) as monthly_revenue
FROM products p
INNER JOIN order_items oi ON p.product_id = oi.product_id
INNER JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY p.category, p.product_name, TO_CHAR(o.order_date, 'YYYY-MM')
)
SELECT
category,
product_name,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '11 months', 'YYYY-MM') THEN monthly_revenue END) as month_1,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '10 months', 'YYYY-MM') THEN monthly_revenue END) as month_2,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '9 months', 'YYYY-MM') THEN monthly_revenue END) as month_3,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '8 months', 'YYYY-MM') THEN monthly_revenue END) as month_4,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '7 months', 'YYYY-MM') THEN monthly_revenue END) as month_5,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '6 months', 'YYYY-MM') THEN monthly_revenue END) as month_6,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '5 months', 'YYYY-MM') THEN monthly_revenue END) as month_7,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '4 months', 'YYYY-MM') THEN monthly_revenue END) as month_8,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '3 months', 'YYYY-MM') THEN monthly_revenue END) as month_9,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '2 months', 'YYYY-MM') THEN monthly_revenue END) as month_10,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE - INTERVAL '1 month', 'YYYY-MM') THEN monthly_revenue END) as month_11,
SUM(CASE WHEN order_month = TO_CHAR(CURRENT_DATE, 'YYYY-MM') THEN monthly_revenue END) as month_12,
SUM(monthly_revenue) as total_revenue
FROM monthly_sales
GROUP BY category, product_name
ORDER BY category, total_revenue DESC;
```
## Interview Strategy for Advanced Questions
### 1. Problem-Solving Approach
1. **Understand the business context**
2. **Break down complex problems into smaller parts**
3. **Consider performance implications**
4. **Think about edge cases and data quality**
5. **Explain your reasoning clearly**
### 2. Code Quality Considerations
- **Use meaningful aliases and comments**
- **Structure queries for readability**
- **Consider maintainability**
- **Handle NULL values appropriately**
- **Use appropriate data types**
### 3. Performance Optimization
- **Understand execution plans**
- **Know when to use indexes**
- **Consider query complexity**
- **Think about scalability**
- **Discuss trade-offs**
## Common Advanced Interview Topics
1. **Recursive CTEs and hierarchical data**
2. **Complex window functions and analytics**
3. **Advanced JOIN scenarios**
4. **Performance optimization techniques**
5. **Data modeling and normalization**
6. **Stored procedures and functions**
7. **Transaction management and concurrency**
8. **Database design patterns**
## Conclusion
Advanced SQL interviews test your ability to:
- **Solve complex business problems with SQL**
- **Optimize queries for performance**
- **Handle large-scale data scenarios**
- **Design efficient database solutions**
- **Communicate technical concepts clearly**
Success requires deep understanding of SQL internals, extensive practice with complex scenarios, and the ability to think systematically about data problems.
Ready to tackle advanced SQL challenges? [Start your SQL interview preparation →](/interview)