Not all GROUPING SETS are created equal

Introduction

I’ve just realized, once again, that the Oracle Optimizer is even smarter than I thought.
The comments (by Iudith Mentzel and Thomas Mautsch) to my previous post, GROUPING SETS and COLLECT don’t get along, made me understand that not all GROUPING SETS were created equal.

The examples in this post are from Oracle 18.3

Extended Aggregation Options

GROUPING SETS, ROLLUP and CUBE are great features that enable us to perform several aggregations on the same population in a single statement, that is shorter, more elegant, and usually more efficient, than the alternative of combining (e.g., using UNION ALL) the results of the individual aggregations.

In my opinion, GROUPING SETS, ROLLUP and CUBE should be in the toolbox of every Oracle developer. If you are not familiar with them, I highly recommend reading Tim Hall‘s article https://oracle-base.com/articles/misc/rollup-cube-grouping-functions-and-grouping-sets.

Under the Hood

We can use GROUPING SETS with many different combinations, but the implementation under the hood may be different for different combinations.
Let’s take, for example, the following two queries. Note that the first query has to perform 4 aggregations (some of them with a composite grouping key), while the second query performs only 2 aggregations (with a single column grouping key).
Try to guess which of the two will run faster…

select grouping_id(x,y,z) grp_id,
       x,
       y,
       z,
       count(w)
from   t
group  by grouping sets((x),(y,x),(z,x,y),());

select grouping_id(x,y) grp_id,
       x,
       y,
       count(w)
from   t
group  by grouping sets((x),(y));

This is (of course) a trick question. First, because I wouldn’t ask it if the answer was the intuitive one. And second, because the answer depends on various factors which I didn’t show, like the table structure and its content. But anyway, take a guess, you have nothing to lose 🙂

Setup

So let’s create the table and populate it with some data.

SQL> create table t (
  2    x number,
  3    y number,
  4    z number,
  5    w char(1000)
  6  );

Table created.

SQL>
SQL> insert into t (x,y,z,w)
  2  select mod(rownum, 3),
  3         mod(rownum, 3)+1,
  4         mod(rownum, 3)+2,
  5         rownum
  6  from dual
  7  connect by level <= 1000000;

1000000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

The Test

Let’s turn timing and statistics on:

SQL> set timing on
SQL> set autotrace on stat

And execute the queries.

SQL> select grouping_id(x,y,z) grp_id,
  2         x,
  3         y,
  4         z,
  5         count(w)
  6  from   t
  7  group  by grouping sets((x),(y,x),(z,x,y),());

    GRP_ID          X          Y          Z   COUNT(W)
---------- ---------- ---------- ---------- ----------
         0          0          1          2     333333
         1          0          1                333333
         3          0                           333333
         0          1          2          3     333334
         1          1          2                333334
         3          1                           333334
         0          2          3          4     333333
         1          2          3                333333
         3          2                           333333
         7                                     1000000

10 rows selected.

Elapsed: 00:00:00.40

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     143027  consistent gets
          0  physical reads
          0  redo size
       1015  bytes sent via SQL*Net to client
        623  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         10  rows processed

SQL> select grouping_id(x,y) grp_id,
  2         x,
  3         y,
  4         count(w)
  5  from   t
  6  group  by grouping sets((x),(y));

    GRP_ID          X          Y   COUNT(W)
---------- ---------- ---------- ----------
         2                     1     333333
         2                     2     333334
         2                     3     333333
         1          1                333334
         1          2                333333
         1          0                333333

6 rows selected.

Elapsed: 00:00:06.85

Statistics
----------------------------------------------------------
       1118  recursive calls
     145711  db block gets
     428973  consistent gets
     142858  physical reads
        624  redo size
        890  bytes sent via SQL*Net to client
        623  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          6  rows processed

The first query performed 4 aggregations in less than a second, while the second query performed 2 (allegedly simpler) aggregations on the same population in more than 6 seconds and consumed much more resources.

Analysis of the First Query

The reason for the “winning” of the first query is that the 4 grouping keys in this query are “telescopic” – they are all included (either partially or completely) in the leading part of the same key – (x,y,z)

()
(x)
(x,y)
(x,y,z)

To make it a bit more interesting, I used (y,x) in the query instead of (x,y) and (z,x,y) instead of (x,y,z), but they are equivalent as grouping keys

So in this case it’s possible to sort the table by x,y,z and to calculate all 4 aggregations in a single scan. And indeed, that’s what Oracle does here; the Optimizer has a special operation for that – SORT GROUP BY ROLLUP. Here is the execution plan of the first query:

SQL> select grouping_id(x,y,z) grp_id,
  2         x,
  3         y,
  4         z,
  5         count(w)
  6  from   t
  7  group  by grouping sets((x),(y,x),(z,x,y),());

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |    14 | 14140 | 40080   (1)| 00:00:02 |
|   1 |  SORT GROUP BY ROLLUP|      |    14 | 14140 | 40080   (1)| 00:00:02 |
|   2 |   TABLE ACCESS FULL  | T    |  1000K|   963M| 40007   (1)| 00:00:02 |
-----------------------------------------------------------------------------

GROUPING SETS with these 4 grouping keys is exactly the same as ROLLUP(x,y,z), so the operation name SORT GROUP BY ROLLUP makes lots of sense.
Even if we use just a subset of a ROLLUP, Oracle can still take advantage of the SORT GROUP BY ROLLUP operation.
For example:

SQL> select grouping_id(x,y,z) grp_id,
  2         x,
  3         y,
  4         z,
  5         count(*)
  6  from   t
  7  group  by grouping sets((x),(x,y,z));

    GRP_ID          X          Y          Z   COUNT(*)
---------- ---------- ---------- ---------- ----------
         0          0          1          2     333333
         3          0                           333333
         0          1          2          3     333334
         3          1                           333334
         0          2          3          4     333333
         3          2                           333333

6 rows selected.

Elapsed: 00:00:00.41

Execution Plan
----------------------------------------------------------
Plan hash value: 372929889

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |    14 |   126 | 40079   (1)| 00:00:02 |
|   1 |  SORT GROUP BY ROLLUP|      |    14 |   126 | 40079   (1)| 00:00:02 |
|   2 |   TABLE ACCESS FULL  | T    |  1000K|  8789K| 40006   (1)| 00:00:02 |
-----------------------------------------------------------------------------

Analysis of the Second Query

What about the second query? I used GROUPING SETS ((x),(y)), and here the “telescopic” (or rollup) characteristic does not apply.
Let’s see the execution plan:

SQL> select grouping_id(x,y) grp_id,
  2         x,
  3         y,
  4         count(w)
  5  from   t
  6  group  by grouping sets((x),(y));

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                            |     7 |   364 |   114K  (1)| 00:00:05 |
|   1 |  TEMP TABLE TRANSFORMATION               |                            |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D667A_2180326 |       |       |            |          |
|   3 |    TABLE ACCESS FULL                     | T                          |  1000K|   960M| 40007   (1)| 00:00:02 |
|   4 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D667B_2180326 |       |       |            |          |
|   5 |    HASH GROUP BY                         |                            |     3 |  3012 | 37160   (1)| 00:00:02 |
|   6 |     TABLE ACCESS FULL                    | SYS_TEMP_0FD9D667A_2180326 |  1000K|   957M| 37086   (1)| 00:00:02 |
|   7 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D667B_2180326 |       |       |            |          |
|   8 |    HASH GROUP BY                         |                            |     3 |  3012 | 37160   (1)| 00:00:02 |
|   9 |     TABLE ACCESS FULL                    | SYS_TEMP_0FD9D667A_2180326 |  1000K|   957M| 37086   (1)| 00:00:02 |
|  10 |   VIEW                                   |                            |     3 |   156 |     2   (0)| 00:00:01 |
|  11 |    TABLE ACCESS FULL                     | SYS_TEMP_0FD9D667B_2180326 |     3 |   126 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------

Oracle loads the table into a temporary table (SYS_TEMP_0FD9D667A_2180326).
Then it scans SYS_TEMP_0FD9D667A_2180326 twice, for performing two separate “simple” GROUP BY operations (one by X and one by Y; using HASH GROUP BY operations), and inserts the results into a second temporary table (SYS_TEMP_0FD9D667B_2180326).
And finally it scans SYS_TEMP_0FD9D667B_2180326 for returning the combined results.

Summary

In this post we saw two possible execution plans for queries with GROUP BY GROUPING SETS.
I will write about additional plans in the next post.

Hi Oren,

Amazing analysis

but a more complex case:

select x,y,z from t group by cube(x), rollup(y,z);

the plan is:

—————————————————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
—————————————————————————————————————————
| 0 | SELECT STATEMENT | | 1 | 39 | 15 (20)| 00:00:01 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | MULTI-TABLE INSERT | | | | | |
| 3 | SORT GROUP BY ROLLUP | | 1 | 39 | 3 (34)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T | 1 | 39 | 2 (0)| 00:00:01 |
| 5 | DIRECT LOAD INTO (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6DEC_3AB9AAFD | | | | |
| 6 | DIRECT LOAD INTO (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6DED_3AB9AAFD | | | | |
| 7 | MULTI-TABLE INSERT | | | | | |
| 8 | SORT GROUP BY ROLLUP | | 1 | 26 | 3 (34)| 00:00:01 |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DEC_3AB9AAFD | 1 | 26 | 2 (0)| 00:00:01 |
| 10 | DIRECT LOAD INTO (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6DEE_3AB9AAFD | | | | |
| 11 | DIRECT LOAD INTO (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6DED_3AB9AAFD | | | | |
| 12 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6DED_3AB9AAFD | | | | |
| 13 | SORT GROUP BY ROLLUP | | 1 | 13 | 3 (34)| 00:00:01 |
| 14 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DEE_3AB9AAFD | 1 | 13 | 2 (0)| 00:00:01 |
| 15 | VIEW | | 3 | 117 | 6 (0)| 00:00:01 |
| 16 | VIEW | | 3 | 117 | 6 (0)| 00:00:01 |
| 17 | UNION-ALL | | | | | |
| 18 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DEC_3AB9AAFD | 1 | 39 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DED_3AB9AAFD | 1 | 39 | 2 (0)| 00:00:01 |
| 20 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DEE_3AB9AAFD | 1 | 26 | 2 (0)| 00:00:01 |
—————————————————————————————————————————

the key pointer is at the cursor memory duration, and the temp file can be reused.

3 thoughts on “Not all GROUPING SETS are created equal”

Iudith Mentzel says:

July 4, 2019 at 5:30 pm

Hi Oren,

Amazing analysis, hats off :):)

I guess that copying the data to the temporary table for the second query is done for achieving read consistency of the 2 independent
GROUP BY results.
I just wonder why does this approach look “more appealing” to Oracle
than using (internally) a flashback to the overall query starting SCN,
which it is doing anyway for any query, with the only difference that here
this should be repeated for each GROUP BY operation.

Or, maybe this decision depends in some way on the table size ?
And/or, possibly, on some (tracked) “change behavior” of the table ?

Waiting eagerly for you further posts 🙂

Cheers & Have a nice weekend !
Iudith

1. Oren Nakdimon says:
  
  July 4, 2019 at 5:47 pm
  
  Thanks Iudith 🙂
  And thanks for the comment in the previous post, that triggered my investigation.
  
  I believe the use of temporary table here is an attempt to reduce I/O (assuming the base row set will fit in memory), similarly to the materialization of subquery factoring.
  I will show in the next post an example where the optimizer chooses to scan the actual table twice instead of using a temporary table.
  
  Thanks,
  Oren.
  
casa says:

February 16, 2020 at 6:24 am

Hi Oren,

Amazing analysis

but a more complex case:

select x,y,z from t group by cube(x), rollup(y,z);

the plan is:

—————————————————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
—————————————————————————————————————————
| 0 | SELECT STATEMENT | | 1 | 39 | 15 (20)| 00:00:01 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | MULTI-TABLE INSERT | | | | | |
| 3 | SORT GROUP BY ROLLUP | | 1 | 39 | 3 (34)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T | 1 | 39 | 2 (0)| 00:00:01 |
| 5 | DIRECT LOAD INTO (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6DEC_3AB9AAFD | | | | |
| 6 | DIRECT LOAD INTO (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6DED_3AB9AAFD | | | | |
| 7 | MULTI-TABLE INSERT | | | | | |
| 8 | SORT GROUP BY ROLLUP | | 1 | 26 | 3 (34)| 00:00:01 |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DEC_3AB9AAFD | 1 | 26 | 2 (0)| 00:00:01 |
| 10 | DIRECT LOAD INTO (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6DEE_3AB9AAFD | | | | |
| 11 | DIRECT LOAD INTO (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6DED_3AB9AAFD | | | | |
| 12 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6DED_3AB9AAFD | | | | |
| 13 | SORT GROUP BY ROLLUP | | 1 | 13 | 3 (34)| 00:00:01 |
| 14 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DEE_3AB9AAFD | 1 | 13 | 2 (0)| 00:00:01 |
| 15 | VIEW | | 3 | 117 | 6 (0)| 00:00:01 |
| 16 | VIEW | | 3 | 117 | 6 (0)| 00:00:01 |
| 17 | UNION-ALL | | | | | |
| 18 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DEC_3AB9AAFD | 1 | 39 | 2 (0)| 00:00:01 |
| 19 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DED_3AB9AAFD | 1 | 39 | 2 (0)| 00:00:01 |
| 20 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6DEE_3AB9AAFD | 1 | 26 | 2 (0)| 00:00:01 |
—————————————————————————————————————————

the key pointer is at the cursor memory duration, and the temp file can be reused.