Skip to main content

GROUP BY Queries

The GROUP BY clause in SQL is used to group rows that have the same values in one or more columns. It’s mainly used along with aggregate functions (like SUM(), AVG(), COUNT(), MIN(), MAX()) to compute summary statistics for each group.

It helps us transform row-level data into aggregated insights, such as “total sales per region” or “average salary per department”.

Without GROUP BY, aggregate functions treat the entire dataset as one group. With it, we can compute aggregates for each unique value or combination of values.

Basic Syntax

SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1;

Key points:

  • Every column in the SELECT clause must either appear in the GROUP BY clause or be used inside an aggregate function.
  • The GROUP BY clause always appears after WHERE but before HAVING and ORDER BY.

Basic Example

Suppose we have a table sales:

regionproductamount
EastApple100
WestOrange200
EastApple150
WestBanana50

Query:

SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;

Output:

regiontotal_sales
East250
West250

The GROUP BY region clause combines rows with the same region and applies SUM() on the amount column for each group.

GROUP BY with Multiple Columns

We can group by multiple columns to create nested or hierarchical groupings.

SELECT region, product, SUM(amount) AS total_sales
FROM sales
GROUP BY region, product;

Output:

regionproducttotal_sales
EastApple250
WestBanana50
WestOrange200

Here, grouping happens for each (region, product) pair.

GROUP BY with WHERE and HAVING

Both WHERE and HAVING are used for filtering, but they act at different stages:

ClauseFiltersStage
WHERERows (before grouping)Before grouping
HAVINGGroups (after grouping)After grouping

Example:

SELECT region, SUM(amount) AS total_sales
FROM sales
WHERE amount > 50
GROUP BY region
HAVING SUM(amount) > 200;

Explanation:

  • WHERE filters out rows where amount <= 50.
  • Then rows are grouped by region.
  • Finally, HAVING filters out groups whose SUM(amount) is not greater than 200.

GROUP BY with ORDER BY

You can sort aggregated results using ORDER BY.

SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region
ORDER BY total_sales DESC;

This sorts groups by total sales in descending order.

GROUP BY with Aliases and Expressions

We can group by computed or derived values.

SELECT UPPER(region) AS region_upper, SUM(amount)
FROM sales
GROUP BY UPPER(region);

Here, grouping is done based on the uppercase form of region names.

GROUP BY with Column Position

Some databases allow grouping by the position of the column in the SELECT list.

SELECT region, product, SUM(amount)
FROM sales
GROUP BY 1, 2;

This groups by the first and second selected columns (region and product).

Note:

  • MySQL, PostgreSQL, and Oracle support this.
  • SQL Server discourages it and recommends explicit column names.

GROUPING SETS, ROLLUP, and CUBE

When we need multiple groupings in a single query, instead of using multiple UNION queries, we can use GROUPING SETS, ROLLUP, or CUBE. These are part of the SQL:1999 standard.

GROUPING SETS

GROUPING SETS allows explicit specification of multiple independent grouping combinations in a single query.

SELECT region, product, SUM(amount) AS total_sales
FROM sales
GROUP BY GROUPING SETS ((region), (product), ());

Result:

regionproducttotal_sales
EastNULL250
WestNULL250
NULLApple250
NULLOrange200
NULLBanana50
NULLNULL550

Explanation:

  • (region) → groups by region.
  • (product) → groups by product.
  • () → represents a grand total (no grouping).

This is equivalent to doing:

SELECT region, NULL AS product, SUM(amount) FROM sales GROUP BY region
UNION ALL
SELECT NULL AS region, product, SUM(amount) FROM sales GROUP BY product
UNION ALL
SELECT NULL, NULL, SUM(amount) FROM sales;

but in a single, efficient query.

ROLLUP

ROLLUP is used for hierarchical subtotals — it performs a top-down aggregation from detailed to summary levels.

SELECT region, product, SUM(amount) AS total_sales
FROM sales
GROUP BY ROLLUP(region, product);

Result:

regionproducttotal_sales
EastApple250
EastNULL250
WestBanana50
WestOrange200
WestNULL250
NULLNULL500

Explanation:

  • (region, product) → specific combinations.
  • (region) → subtotal per region.
  • () → grand total.

Thus, ROLLUP gives cumulative totals up the hierarchy.

CUBE

CUBE creates all possible grouping combinations, producing a multidimensional aggregation (cross-tabulated summary).

SELECT region, product, SUM(amount) AS total_sales
FROM sales
GROUP BY CUBE(region, product);

Result:

regionproducttotal_sales
EastApple250
EastNULL250
WestBanana50
WestOrange200
WestNULL250
NULLApple250
NULLBanana50
NULLOrange200
NULLNULL550

Explanation:

  • Includes all group combinations:

    • (region, product) — detailed
    • (region) — subtotal per region
    • (product) — subtotal per product
    • () — grand total

CUBE is ideal for multidimensional analytics (like pivot tables).

GROUPING() and GROUPING_ID()

In ROLLUP and CUBE, NULL values may appear for subtotals. To differentiate between a real NULL and an aggregated subtotal, SQL provides the GROUPING() function.

SELECT region, product,
SUM(amount) AS total_sales,
GROUPING(region) AS g_region,
GROUPING(product) AS g_product
FROM sales
GROUP BY CUBE(region, product);

Output:

regionproducttotal_salesg_regiong_product
EastApple25000
EastNULL25001
NULLApple25010
NULLNULL55011

If GROUPING(column) = 1, that column’s value is the result of aggregation (not a real data value).

GROUPING_ID() (supported in SQL Server, Oracle, and PostgreSQL) combines multiple grouping flags into a single integer bitmask for complex analysis.

Differences Across Databases

FeaturePostgreSQLMySQLSQL ServerOracle
Basic GROUP BYFully supportedFully supportedFully supportedFully supported
GROUPING SETSSupportedSupported (since 8.0.18)SupportedSupported
ROLLUPSupportedSupportedSupportedSupported
CUBESupportedSupported (since 8.0.18)SupportedSupported
GROUPING()SupportedSupportedSupportedSupported
GROUPING_ID()SupportedNot supportedSupportedSupported
HAVING without GROUP BYSupportedSupportedSupportedSupported
Alias in GROUP BYNot allowed (must use column name)AllowedAllowedAllowed
Column position groupingSupportedSupportedDiscouragedSupported

Notes:

  • MySQL < 8.0 didn’t support ROLLUP, CUBE, or GROUPING SETS.
  • PostgreSQL adheres closely to the SQL standard and is strict about alias usage.
  • Oracle and SQL Server provide additional analytic extensions like GROUPING_ID() and better optimizer handling for rollups.