Skip to main content

GROUP BY

GROUP BY

Description

GROUP BY GROUPING SETSCUBEROLLUP is an extension of the GROUP BY clause. It can realize the aggregation of groups of multiple sets in a GROUP BY clause. The result is equivalent to the UNION operation of multiple corresponding GROUP BY clauses.

GROUP BY clause is a special case of GROUP BY GROUPING SETS containing only one element. For example, the GROUPING SETS statement:

SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );

The query result is equivalent to:

SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b
UNION
SELECT a, null, SUM( c ) FROM tab1 GROUP BY a
UNION
SELECT null, b, SUM( c ) FROM tab1 GROUP BY b
UNION
SELECT null, null, SUM( c ) FROM tab1

GROUPING(expr) indicates whether a column is an aggregate column. If it is an aggregate column, it is 0, otherwise it is 1.

GROUPING_ID(expr [ , expr [ , ... ] ]) is similar to GROUPING. GROUPING_ ID calculates the bitmap value of a column list according to the specified column order, and each bit is the value of GROUPING.

GROUPING_ID() function returns the decimal value of the bit vector.

Syntax

SELECT ...
FROM ...
[ ... ]
GROUP BY [
, ... |
GROUPING SETS [, ...] ( groupSet [ , groupSet [ , ... ] ] ) |
ROLLUP(expr [ , expr [ , ... ] ]) |
expr [ , expr [ , ... ] ] WITH ROLLUP |
CUBE(expr [ , expr [ , ... ] ]) |
expr [ , expr [ , ... ] ] WITH CUBE
]
[ ... ]

Parameters

groupSet represents a set composed of columns, aliases or expressions in the select list. groupSet ::= { ( expr [ , expr [ , ... ] ] )}

expr indicates the column, alias or expression in the select list.

Note

starrocks supports syntax like PostgreSQL. The syntax examples are as follows:

SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY ROLLUP(a,b,c)
SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY CUBE(a,b,c)

ROLLUP(a,b,c) is equivalent to the followingGROUPING SETS statement:

GROUPING SETS (
(a,b,c),
( a, b ),
( a),
( )
)

CUBE ( a, b, c ) is equivalent to the followingGROUPING SETS statement:

GROUPING SETS (
( a, b, c ),
( a, b ),
( a, c ),
( a ),
( b, c ),
( b ),
( c ),
( )
)

Examples

The following is an example of actual data:

> SELECT * FROM t;
+------+------+------+
| k1 | k2 | k3 |
+------+------+------+
| a | A | 1 |
| a | A | 2 |
| a | B | 1 |
| a | B | 3 |
| b | A | 1 |
| b | A | 4 |
| b | B | 1 |
| b | B | 5 |
+------+------+------+
8 rows in set (0.01 sec)

> SELECT k1, k2, SUM(k3) FROM t GROUP BY GROUPING SETS ( (k1, k2), (k2), (k1), ( ) );
+------+------+-----------+
| k1 | k2 | sum(`k3`) |
+------+------+-----------+
| b | B | 6 |
| a | B | 4 |
| a | A | 3 |
| b | A | 5 |
| NULL | B | 10 |
| NULL | A | 8 |
| a | NULL | 7 |
| b | NULL | 11 |
| NULL | NULL | 18 |
+------+------+-----------+
9 rows in set (0.06 sec)

> SELECT k1, k2, GROUPING_ID(k1,k2), SUM(k3) FROM t GROUP BY GROUPING SETS ((k1, k2), (k1), (k2), ());
+------+------+---------------+----------------+
| k1 | k2 | grouping_id(k1,k2) | sum(`k3`) |
+------+------+---------------+----------------+
| a | A | 0 | 3 |
| a | B | 0 | 4 |
| a | NULL | 1 | 7 |
| b | A | 0 | 5 |
| b | B | 0 | 6 |
| b | NULL | 1 | 11 |
| NULL | A | 2 | 8 |
| NULL | B | 2 | 10 |
| NULL | NULL | 3 | 18 |
+------+------+---------------+----------------+
9 rows in set (0.02 sec)