Relational Algebra
This note is complete, reviewed, and considered stable.
Relational Algebra is a formal system of query operations used on relational databases. It provides a mathematical foundation for understanding how database queries work and serves as the theoretical basis for SQL. Relational algebra operations allow us to manipulate relations (tables) and extract meaningful information from data.
Basic Operations
Relational algebra defines six fundamental operations that form the foundation for all query processing:
Selection (σ)
Selection filters tuples (rows) based on a condition. It returns a subset of rows that satisfy the specified predicate.
Syntax
σ_condition(relation)
Example
σ_salary > 50000(Employees)
This returns all employees with a salary greater than 50,000.
Properties
- Unary operation (operates on a single relation)
- Commutative:
σ_c1(σ_c2(R)) = σ_c2(σ_c1(R)) - Condition can use comparison operators
(=, <, >, <=, >=, !=)and logical operators (AND, OR, NOT)
Projection (π)
Projection selects specific columns (attributes) from a relation. It removes unwanted columns and returns a vertical subset of the relation.
Syntax
π_attributes(relation)
Example
π_name, salary(Employees)
This returns only the name and salary columns from the Employees table.
Properties
- Unary operation
- Removes duplicate tuples from the result
- Useful for eliminating irrelevant attributes
Cartesian Product (×)
Cartesian Product combines all tuples from two relations, creating a new relation with all possible combinations of rows.
Syntax
relation1 × relation2
Example
Employees × Departments
If Employees has 10 rows and Departments has 5 rows, the result has 50 rows.
Properties
- Binary operation
- Commutative: R1 × R2 = R2 × R1
- Associative: (R1 × R2) × R3 = R1 × (R2 × R3)
Union (∪)
Union combines tuples from two relations, returning all tuples that appear in either relation.
Syntax
relation1 ∪ relation2
Example
Managers ∪ Directors
This returns all people who are either managers or directors.
Properties
- Binary operation
- Both relations must have the same schema (union-compatible)
- Commutative: R1 ∪ R2 = R2 ∪ R1
- Associative: (R1 ∪ R2) ∪ R3 = R1 ∪ (R2 ∪ R3)
- Removes duplicate tuples
Set Difference (−)
Set Difference returns tuples that appear in the first relation but not in the second.
Syntax
relation1 − relation2
Example
Employees − Managers
This returns all employees who are not managers.
Properties
- Binary operation
- Both relations must have the same schema
- Not commutative: R1 − R2 ≠ R2 − R1
- Not associative
Intersection (∩)
Intersection returns tuples that appear in both relations.
Syntax
relation1 ∩ relation2
Example
FullTimeEmployees ∩ DepartmentA
This returns employees who are both full-time and in Department A.
Properties
- Binary operation
- Both relations must have the same schema
- Commutative: R1 ∩ R2 = R2 ∩ R1
- Associative: (R1 ∩ R2) ∩ R3 = R1 ∩ (R2 ∩ R3)
- Can be expressed using other operations: R1 ∩ R2 = R1 − (R1 − R2)
Derived Operations
Derived operations are built from combinations of basic operations and are essential for practical queries:
Join (⋈)
Join combines tuples from two relations based on a join condition. It is one of the most important operations in relational algebra.
Types of Joins:
Natural Join
A Natural Join combines relations on their common attributes, automatically eliminating duplicate columns.
Syntax
relation1 ⋈ relation2
Example
Employees ⋈ Departments
This joins employees with departments based on matching department IDs.
Theta Join
A Theta Join combines tuples from two relations based on a specified condition using any comparison operator.
Syntax
relation1 ⋈_condition relation2
Example
Employees ⋈_(Employees.dept_id = Departments.id) Departments
Equi-Join
An Equi-Join is a theta join where the condition uses only equality.
Example
Employees ⋈_(Employees.dept_id = Departments.id) Departments
Properties
- Binary operation
- Produces a Cartesian product filtered by the join condition
- Can be expressed as:
R1 ⋈_c R2 = σ_c(R1 × R2)
Outer Joins
Outer Joins preserve unmatched tuples from one or both relations:
Left Outer Join (⟕)
Retains all tuples from the left relation, padding with NULL values when no match exists in the right relation.
Example
Employees ⟕ Departments
All employees are included, even those without assigned departments.
Right Outer Join (⟖)
Retains all tuples from the right relation, padding with NULL values when no match exists in the left relation.
Full Outer Join (⟛)
Retains all tuples from both relations, padding with NULL values where matches don't exist.
Division (÷)
Division finds tuples in the first relation that are associated with all tuples in the second relation. It's useful for queries like "find all customers who have ordered every product."
Syntax
relation1 ÷ relation2
Example
Orders ÷ Products
This returns customers who have ordered every product.
Properties
- Binary operation
- Not commutative
- Complex operation, often decomposed into simpler operations
Aggregate Functions
Relational algebra extends with aggregate functions for summarization:
Aggregation (γ)
Computes a single value from a set of values.
Common Aggregate Functions:
- COUNT: Counts the number of tuples or non-null values
- SUM: Sums all values in a column
- AVG: Calculates the average of values
- MIN: Finds the minimum value
- MAX: Finds the maximum value
Syntax
γ_aggregation_function(relation)
Example
γ_COUNT(*), SUM(salary)(Employees)
Grouping
Groups tuples based on specific attributes and applies aggregate functions to each group.
Syntax
relation GROUP BY attributes AGGREGATION_FUNCTION