This module covers Window Functions, Recursive Queries, Set Operations, Pivoting and Unpivoting.
Window functions are powerful tools in SQL that allow you to perform calculations across a set of table rows that are somehow related to the current row. They are especially useful for running totals, cumulative aggregates, ranking, and more.
SELECT
employee_id,
name,
salary,
ROW_NUMBER() OVER(ORDER BY salary DESC) AS row_num
FROM employees;
This example assigns a unique sequential number to each employee based on their salary in descending order.
SELECT
department,
name,
salary,
ROW_NUMBER() OVER(PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;
In this example, the `PARTITION BY` clause ensures that row numbers reset for each department.
Both functions assign rankings, but `RANK()` leaves gaps when there are ties (e.g., two employees with the same salary), while `DENSE_RANK()` does not leave gaps.
SELECT
date,
sales,
LAG(sales, 1) OVER(ORDER BY date) AS prev_sales,
LEAD(sales, 1) OVER(ORDER BY date) AS next_sales
FROM monthly_sales;
This query retrieves the previous and next month's sales values for each row.
Recursive queries are powerful SQL techniques used to handle hierarchical or self-referential data. They allow you to query data that refers back to itself, such as organizational structures, family trees, or nested categories.
A recursive query uses a Common Table Expression (CTE) to repeatedly execute the query until a specified condition is met. This allows you to traverse hierarchies and retrieve data that would otherwise require multiple queries.
CTEs are temporary result sets defined within a single query. They can be used to simplify complex queries and make them more readable. Recursive CTEs have two parts: the anchor member and the recursive member.
WITH cte_name AS (
-- Anchor part
SELECT ...
UNION ALL
-- Recursive part
SELECT ...
)
SELECT * FROM cte_name;
WITH EmployeeHierarchy AS (
-- Anchor part
SELECT employee_id, first_name, last_name, manager_id
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive part
SELECT e.employee_id, e.first_name, e.last_name, e.manager_id
FROM employees e
INNER JOIN EmployeeHierarchy eh ON e.manager_id = eh.employee_id
)
SELECT * FROM EmployeeHierarchy;
Set operations in SQL are powerful tools for combining, intersecting, or excluding result sets from multiple queries. These operations allow you to manipulate data at a higher level of abstraction, making complex queries more manageable and efficient.
The `UNION` operator combines two or more result sets into a single set, eliminating duplicate rows. Use `UNION ALL` when you want to include all results, including duplicates.
SELECT column1 FROM table1
UNION
SELECT column1 FROM table2;
Use `UNION` when you want to combine two tables but don't want duplicates. For example, combining user data from multiple sources.
SELECT id, name FROM employees
UNION ALL
SELECT id, name FROM contractors;
Use `UNION ALL` when performance is a priority and you don't mind duplicates. This operation doesn't check for duplicates, so it's faster.
The `INTERSECT` operator returns only the rows that are present in both result sets. This is useful for finding common elements between two datasets.
SELECT id FROM orders
INTERSECT
SELECT id FROM customers;
Use `INTERSECT` when you need to find overlapping data between tables, such as finding users who are both active and premium.
The `EXCEPT` operator returns rows from the first query that are not present in the second query. This is useful for finding differences between two datasets.
SELECT id FROM employees
EXCEPT
SELECT id FROM contractors;
Use `EXCEPT` when you need to find data that exists in one table but not in another, such as identifying users who haven't completed a certain action.
Pivoting and unpivoting are powerful techniques in SQL that allow you to dynamically transform the shape of your data. These operations enable you to convert rows into columns (pivot) or columns into rows (unpivot), making it easier to analyze and present your data in various formats.
The `PIVOT` clause allows you to rotate rows into columns based on specified values. This is particularly useful for creating crosstabular reports or aggregating data across multiple dimensions.
SELECT Product, [2023] AS Sales_2023, [2024] AS Sales_2024
FROM (
SELECT Product, Year, SUM(Sales) AS TotalSales
FROM SalesData
GROUP BY Product, Year
) AS SourceTable
PIVOT (
SUM(TotalSales)
FOR Year IN (2023, 2024)
) AS PivotTable;
In this example, we're pivoting sales data by year, creating dynamic columns for each year.
The `UNPIVOT` clause transforms columns back into rows. This is useful when you need to revert data from a pivoted format or normalize it for further processing.
SELECT Product, Year, Sales
FROM (
SELECT Product, [2023], [2024]
FROM PivotTable
) AS SourceTable
UNPIVOT (
Sales FOR Year IN ([2023], [2024])
) AS UnpivotTable;
Here, we're unpivoting the sales data to return it to a normalized format.
When you need more flexibility, `CASE` expressions can be used to dynamically determine pivot columns.
SELECT Product,
SUM(CASE WHEN Year = 2023 THEN Sales ELSE 0 END) AS Sales_2023,
SUM(CASE WHEN Year = 2024 THEN Sales ELSE 0 END) AS Sales_2024
FROM SalesData
GROUP BY Product;
// Bad example: Using wildcards without aliasing
SELECT *
FROM SalesData
PIVOT (
SUM(Sales)
FOR Year IN (2023, 2024)
);
Always explicitly alias columns to maintain clarity and control over the output.
For high-performance pivoting, ensure that the columns used in the `FOR` clause are properly indexed and have a manageable number of unique values.
Question 1 of 16