SORT GROUP BY NOSORT ROLLUP

In the previous post we saw that the Oracle Optimizer has a special operation – SORT GROUP BY ROLLUP – for performing several aggregations of the same population in a single scan, as long as the grouping keys are in “rollup form”; for example:

(a)
(a,b,c)
(a,b,c,d)
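Such a set of grouping keys can be expressed directly with GROUPING SETS. A minimal sketch, assuming a hypothetical table t(a,b,c,d,n):

```sql
-- A "rollup form" GROUPING SETS clause: every grouping key is a
-- prefix of the longest one, so a single pass over rows ordered by
-- (a,b,c,d) can feed all three aggregations at once.
select a, b, c, d, count(n)
from   t
group  by grouping sets ((a), (a,b,c), (a,b,c,d));
```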

If there is an index whose leading part is the same as the rollup grouping key, at least one of the index's columns is defined as NOT NULL (so every row of the table is guaranteed to appear in the index), and the optimizer decides that it is worth it cost-wise, then the sorting part of the operation is omitted and the operation becomes SORT GROUP BY NOSORT ROLLUP.

Here is an example (executed in Oracle 18.3).

SQL> create table t (
  2    x number not null,
  3    y number,
  4    z number,
  5    w char(1000)
  6  );

Table created.

SQL> insert into t (x,y,z,w)
  2  select mod(rownum, 3),
  3         mod(rownum, 3)+1,
  4         mod(rownum, 3)+2,
  5         rownum
  6  from dual
  7  connect by level <= 1000
  8  order by 1,2,3;

1000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

SQL> create index i on t (x,y,z);

Index created.

SQL> set timing on
SQL> set autotrace on
SQL> select grouping_id(x,y) grp_id,
  2         x,
  3         y,
  4         count(z)
  5  from   t
  6  group  by grouping sets ((x),(x,y));

    GRP_ID          X          Y   COUNT(Z)
---------- ---------- ---------- ----------
         0          0          1        333
         1          0                   333
         0          1          2        334
         1          1                   334
         0          2          3        333
         1          2                   333

6 rows selected.

Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 3362344319

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     3 |    27 |     4   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT ROLLUP|      |     3 |    27 |     4   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN           | I    |  1000 |  9000 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          4  physical reads
          0  redo size
        889  bytes sent via SQL*Net to client
        624  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          6  rows processed

Not all GROUPING SETS are created equal

Introduction

I’ve just realized, once again, that the Oracle Optimizer is even smarter than I thought.
The comments (by Iudith Mentzel and Thomas Mautsch) on my previous post, GROUPING SETS and COLLECT don't get along, made me understand that not all GROUPING SETS were created equal.

The examples in this post are from Oracle 18.3.

Extended Aggregation Options

GROUPING SETS, ROLLUP and CUBE are great features that enable us to perform several aggregations on the same population in a single statement that is shorter, more elegant, and usually more efficient than the alternative of combining (e.g., using UNION ALL) the results of the individual aggregations.

In my opinion, GROUPING SETS, ROLLUP and CUBE should be in the toolbox of every Oracle developer. If you are not familiar with them, I highly recommend reading Tim Hall's article https://oracle-base.com/articles/misc/rollup-cube-grouping-functions-and-grouping-sets.

Under the Hood

We can use GROUPING SETS with many different combinations, but the implementation under the hood may be different for different combinations.
Let's take, for example, the following two queries. Note that the first query has to perform 4 aggregations (some of them with a composite grouping key), while the second query performs only 2 aggregations (each with a single-column grouping key).
Try to guess which of the two will run faster… Continue reading “Not all GROUPING SETS are created equal”
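The actual queries are behind the link above; purely for illustration, a hypothetical pair matching the two shapes described (4 aggregations, some with composite keys, vs. 2 single-column aggregations) might look like this, assuming a table t(a,b,c,n):

```sql
-- Query 1: 4 aggregations, some with composite grouping keys
select a, b, c, count(n)
from   t
group  by grouping sets ((a,b), (a,c), (b,c), (a));

-- Query 2: only 2 aggregations, each with a single-column grouping key
select a, b, count(n)
from   t
group  by grouping sets (a, b);
```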

GROUPING SETS and COLLECT Don’t Get Along

I’ve recently got reminded that assuming something will work, just because it makes sense, doesn’t mean it will really work.

While reviewing some code a few days ago, I saw a query of the following form:

select 'X='||x, collect(z)
from t
group by x
union all
select 'Y='||y, collect(z)
from t
group by y;

I immediately recommended converting it to use GROUPING SETS, like this:

select decode(grouping(x), 0, 'X='||x, 'Y='||y),
       collect(z)
from t
group by grouping sets (x,y);

The code will be shorter, more elegant, and probably more efficient. Great, isn’t it?
The only problem is that it doesn't work 🙁

Let’s create a demo table:

SQL> create table t (
  2    x number,
  3    y number,
  4    z number
  5  );

Table created.

SQL> insert into t (x,y,z)
  2  select mod(rownum, 2),
  3         mod(rownum, 3),
  4         rownum
  5  from dual
  6  connect by level <= 10;

10 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

GROUPING SETS works nicely with most of the aggregate functions…

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y) grp_by,
  2         count(*) "CNT",
  3         count(distinct z) "CNT D",
  4         sum(z) "SUM",
  5         avg(z) "AVG",
  6         stddev(z) "STDDEV",
  7         min(z) "MIN",
  8         max(z) "MAX"
  9  from t
 10  group by grouping sets (x,y)
 11  order by 1;

GRP_BY      CNT CNT D  SUM  AVG STDDEV  MIN  MAX
---------- ---- ----- ---- ---- ------ ---- ----
X=0           5     5   30    6   3.16    2   10
X=1           5     5   25    5   3.16    1    9
Y=0           3     3   18    6      3    3    9
Y=1           4     4   22  5.5   3.87    1   10
Y=2           3     3   15    5      3    2    8

5 rows selected.

… but not with the COLLECT aggregate function.
In Oracle 11.2 and 12.1 we get ORA-00604 and ORA-00907:

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y),
  2         collect(z)
  3  from t
  4  group by grouping sets (x,y);
group by grouping sets (x,y)
             *
ERROR at line 4:
ORA-00604: error occurred at recursive SQL level 1
ORA-00907: missing right parenthesis

Looks like a bug.
But in Oracle 12.2 something has changed. No, the bug was not fixed. Instead, the combination is now officially unsupported:

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y),
  2         collect(z)
  3  from t
  4  group by grouping sets (x,y);
       collect(z)
       *
ERROR at line 2:
ORA-03001: unimplemented feature

-- tested in 12.2, 18.3 and 19.3

So, at least for now, the original query should remain as is:

SQL> select 'X='||x grp_by, collect(z) coll
  2  from t
  3  group by x
  4  union all
  5  select 'Y='||y, collect(z)
  6  from t
  7  group by y;

GRP_BY     COLL
---------- --------------------------------------------------
X=0        ST00001HGTfH6lTUWkKMCXAmZAQg=(2, 10, 8, 6, 4)
X=1        ST00001HGTfH6lTUWkKMCXAmZAQg=(1, 9, 7, 5, 3)
Y=0        ST00001HGTfH6lTUWkKMCXAmZAQg=(3, 9, 6)
Y=1        ST00001HGTfH6lTUWkKMCXAmZAQg=(1, 10, 7, 4)
Y=2        ST00001HGTfH6lTUWkKMCXAmZAQg=(2, 8, 5)

EXPAND_SQL_TEXT – Much More Than Just Expanding Views

Overview

There are features in Oracle SQL that are implemented by other, older, features. This is a clever way of supporting new syntax with low effort and low risk – the Oracle Corp engineers only need to convert the SQL statement with the new syntax to an equivalent statement that uses the old syntax they already support. And Oracle has a perfect place for doing this conversion – the expansion stage in the parsing process.

SQL Expansion

When a SQL statement is processed, it goes through several stages, in this order: parsing, optimization, row source generation, and execution.

Note: Parsing is a confusing term, as many times when we say “parsing” (especially “hard parsing”) we actually mean “parsing + optimization + row source generation”.

The first stage, the parsing, is not too complex (compared to the optimization stage). Continue reading “EXPAND_SQL_TEXT – Much More Than Just Expanding Views”
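To see the result of the expansion stage for a given statement, we can use the DBMS_UTILITY.EXPAND_SQL_TEXT procedure. A minimal sketch, assuming a view V defined over some table in the current schema:

```sql
-- EXPAND_SQL_TEXT returns the statement as it looks after the
-- expansion stage - views are replaced by their defining queries.
set serveroutput on
declare
    v_clob clob;
begin
    dbms_utility.expand_sql_text(
        input_sql_text  => 'select * from v',
        output_sql_text => v_clob);
    dbms_output.put_line(v_clob);
end;
/
```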

ODC Appreciation Day : Pattern Matching in SQL

Here's my contribution to the ODC Appreciation Day.

Pattern Matching in SQL, using the MATCH_RECOGNIZE clause, is one of my favorite features, but only recently have I used it “for real”.
MATCH_RECOGNIZE allows us to perform enhanced analysis of row sequences, and to detect sequences that match complex patterns.
This feature gave a significant boost to the analytical capabilities of SQL. It enables solving various types of problems in simpler and much more elegant ways than before, and often with much better performance.
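As a taste of the syntax, here is a minimal sketch (assuming a hypothetical stock_prices table with symbol, trade_date and price columns) that finds “V” shapes – a price drop followed by a recovery:

```sql
-- Find "V" shapes per symbol: one or more falling days
-- followed by one or more rising days.
select *
from   stock_prices
match_recognize (
  partition by symbol
  order by trade_date
  measures strt.trade_date     as start_date,
           last(up.trade_date) as end_date
  one row per match
  after match skip past last row
  pattern (strt down+ up+)
  define
    down as down.price < prev(down.price),
    up   as up.price   > prev(up.price)
);
```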

I learned this feature back in 2013, just shortly after Oracle 12cR1 – the version that introduced the MATCH_RECOGNIZE clause – had been released.
The first time I talked about it was in November 2013, as part of a full day seminar I gave about Oracle 12c new features for developers.
I have been playing with it ever since – for example here, here and here – and have learned to appreciate its powerful capabilities.
I was really excited while giving a presentation about Pattern Matching at ilOUG Tech Days 2015. The reason for my excitement was that Keith Laker, the product manager of the feature I was just presenting, attended the session.

But I didn't have a chance to use MATCH_RECOGNIZE in real-life use cases until recently. I did identify many cases where Pattern Matching would fit perfectly, but I couldn't use it, as I was still on Oracle 11g. Since starting to use Oracle 12c, I have enjoyed using MATCH_RECOGNIZE to solve real problems in my work (although my first attempt revealed a nasty bug).

In a few weeks I'm going to present (a completely new version of) Oracle SQL Pattern Matching Made Easy at the BGOUG Autumn Conference and at DOAG 2018. The purpose of this session is to explain the MATCH_RECOGNIZE feature from the basics, step by step, with many examples, thereby demonstrating how powerful and useful it is, while eliminating the fear of what may at first seem like complex syntax. If you're going to be at one of these conferences, come and say hi 🙂

Here are some good resources for learning about Pattern Matching and its uses:

Index Hints and Distributed Queries

There are two ways to specify indexes in optimizer hints: by the index name, or by the index’s (leading) columns:
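For example, assuming a hypothetical table t with an index t_ix on column x, the two forms of the INDEX hint look like this:

```sql
-- by the index name
select /*+ index(t t_ix) */ * from t where x = 1;

-- by the index's (leading) columns
select /*+ index(t (x)) */ * from t where x = 1;
```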

The latter is usually preferred, as the writer's intention is clearer, and it's immune to changes of the index name.

Note: one can make the opposite argument, that specifying the index name is immune to changes of column names.

Recently I used the option that I prefer – specifying the column names and not the index name – in a distributed query, hinting the remote table, and found out that it didn't work.
The hint in the query that was passed to the remote database contained three question marks instead of the column name that I had specified.

Apparently, this weird replacement of column names by question marks happens quite early in the query execution, during the expansion stage:

SQL> set serveroutput on
SQL> DECLARE
  2      v_clob CLOB;
  3  BEGIN
  4      dbms_utility.expand_sql_text(input_sql_text  => 'select * from dual where exists (select /*+ index (t,(x)) */ null from t@dblink t where  x = :x and y = :y)',
  5                                   output_sql_text => v_clob);
  6      dbms_output.put_line(v_clob);
  7  END;
  8  /
SELECT "A1"."DUMMY" "DUMMY" FROM "SYS"."DUAL" "A1" WHERE  EXISTS (SELECT /*+
INDEX ("A2" ???) */ NULL "NULL" FROM "T"@DBLINK "A2" WHERE "A2"."X"=:B1 AND
"A2"."Y"=:B2)

PL/SQL procedure successfully completed.

Here is an example, tested in Continue reading “Index Hints and Distributed Queries”

EBR – Part 10: Data Dictionary Views for Editioning Views

This is part 10 of a post series about Oracle Edition-Based Redefinition.

Visit the index page for all the parts of the series

Introduction

In a previous post I wrote about Editioning Views and their significant role in online application upgrades using EBR.
In this post we’ll see how editioning views are represented in the data dictionary views.

I'll discuss only the USER_* views, but everything is true for the corresponding ALL_*, DBA_* and CDB_* views as well.

Setup

At this point, after completing three upgrades already (in part 3, part 5 and part 9), our latest edition is V3, and we have the PEOPLE$0 table and two actual instances of the PEOPLE editioning view (one in V2 and one in V3).

Let's create a new edition – V4:

-- connected as system
ORA$BASE> create edition v4;

Edition created.

ORA$BASE> grant use on edition v4 to demo_ebr;

Grant succeeded.

In the new edition, let’s change the PEOPLE editioning view and give different aliases to some of the columns:

ORA$BASE> conn demo_ebr
Enter password:
Connected.
ORA$BASE> @set_edition v4

Session altered.

V4> create or replace editioning view people as
  2  select id,
  3         first_name   as given_name,
  4         last_name    as family_name,
  5         phone_number,
  6         eye_color
  7  from   people$0;

View created.

And let's also create a regular view on top of PEOPLE:

V4> create view brown_eyed_people as
  2  select *
  3  from   people
  4  where  eye_color = 'Brown';

View created.

Note that the BROWN_EYED_PEOPLE regular view references the PEOPLE editioning view. As described and emphasized in previous parts, only editioning views should reference tables; all the other editioned objects (including regular views) should reference editioning views (or other regular views) only.

USER_OBJECTS

In USER_OBJECTS and in USER_OBJECTS_AE both regular views and editioning views appear simply as views (the OBJECT_TYPE column contains the value “VIEW”). Editioning views have no distinctive indication here.

V4>  select object_type,
  2         object_name,
  3         edition_name
  4  from   user_objects
  5  where  object_type in ('TABLE', 'VIEW')
  6  order by 1,2;

OBJECT_TYPE  OBJECT_NAME                    EDITION_NAME
------------ ------------------------------ ------------
TABLE        PEOPLE$0
VIEW         BROWN_EYED_PEOPLE              V4
VIEW         PEOPLE                         V4

V4>  select object_type,
  2         object_name,
  3         edition_name
  4  from   user_objects_ae
  5  where  object_type in ('TABLE', 'VIEW')
  6  order by 1,2,3;

OBJECT_TYPE  OBJECT_NAME                    EDITION_NAME
------------ ------------------------------ ------------
TABLE        PEOPLE$0
VIEW         BROWN_EYED_PEOPLE              V4
VIEW         PEOPLE                         V2
VIEW         PEOPLE                         V3
VIEW         PEOPLE                         V4 

USER_VIEWS

In USER_VIEWS and in USER_VIEWS_AE there is a column – EDITIONING_VIEW – that tells whether a view is an editioning one or not:

V4>  select view_name,
  2         editioning_view
  3  from   user_views
  4  order by 1;

VIEW_NAME            EDITIONING_VIEW
-------------------- -----------------
BROWN_EYED_PEOPLE    N
PEOPLE               Y

V4>  select view_name,
  2         editioning_view,
  3         edition_name
  4  from   user_views_ae
  5  order by 1,3;

VIEW_NAME            EDITIONING_VIEW   EDITION_NAME
-------------------- ----------------- ------------
BROWN_EYED_PEOPLE    N                 V4
PEOPLE               Y                 V2
PEOPLE               Y                 V3
PEOPLE               Y                 V4

USER_EDITIONING_VIEWS

There is a dedicated data dictionary view – USER_EDITIONING_VIEWS – that contains only the editioning views.
For each editioning view we can see the view name and the name of the table covered by this view. Recall that (in a specific edition) an editioning view covers one and only one table, and a table may be covered by at most one editioning view.

V4> select * from user_editioning_views;

VIEW_NAME       TABLE_NAME
--------------- ---------------
PEOPLE          PEOPLE$0

And there is a corresponding USER_EDITIONING_VIEWS_AE view:

V4> select * from user_editioning_views_ae;

VIEW_NAME       TABLE_NAME      EDITION_NAME
--------------- --------------- ------------
PEOPLE          PEOPLE$0        V2
PEOPLE          PEOPLE$0        V3
PEOPLE          PEOPLE$0        V4

USER_EDITIONING_VIEW_COLS

There is a dedicated data dictionary view – USER_EDITIONING_VIEW_COLS – that contains, for every editioning view, all its columns and the corresponding projected columns from the covered table.

V4> select *
  2  from user_editioning_view_cols
  3  order by 1,2;

VIEW_NAME       VIEW_COLUMN_ID VIEW_COLUMN_NAME     TABLE_COLUMN_ID TABLE_COLUMN_NAME
--------------- -------------- -------------------- --------------- --------------------
PEOPLE                       1 ID                                 1 ID
PEOPLE                       2 GIVEN_NAME                         2 FIRST_NAME
PEOPLE                       3 FAMILY_NAME                        3 LAST_NAME
PEOPLE                       4 PHONE_NUMBER                       4 PHONE_NUMBER
PEOPLE                       5 EYE_COLOR                          5 EYE_COLOR

There is also a corresponding USER_EDITIONING_VIEW_COLS_AE view. Let’s use it to see how the PEOPLE editioning view has evolved in the last 3 editions:

V4> select v2,v3,v4
  2  from   (select *
  3          from   user_editioning_view_cols_ae
  4          where  view_name = 'PEOPLE')
  5  pivot(
  6    max(view_column_name || nullif(' [' || nullif(table_column_name, view_column_name) || ']', ' []'))
  7    for edition_name in ('V2' as v2, 'V3' as v3, 'V4' as v4)
  8  )
  9  order  by view_column_id;

V2             V3             V4
-------------- -------------- -------------------------
ID             ID             ID
FIRST_NAME     FIRST_NAME     GIVEN_NAME [FIRST_NAME]
LAST_NAME      LAST_NAME      FAMILY_NAME [LAST_NAME]
PHONE_NUMBER   PHONE_NUMBER   PHONE_NUMBER
               EYE_COLOR      EYE_COLOR

For other parts of the EBR series, please visit the index page.

Wrong Results with IOT, Added Column and Secondary Index

Update: I reported this bug to Oracle, and they fixed it in version 20.1

I found a “wrong results” bug yesterday, easily reproduced in 11g, 12c and 18c.

In short, we may get wrong results under the following circumstances:

  • We have an Index-Organized Table (IOT) with multi-column primary key, populated with rows
  • The table has a secondary index on part of the primary key columns
  • We add another column to the existing IOT
  • We select from the IOT while accessing it via the secondary index

Following is a simple example (also uploaded to livesql.oracle.com).

We create an index-organized table named iot, including the columns x and y.
The primary key is comprised of both columns.

SQL> create table iot (
  2    x number,
  3    y number,
  4    constraint iot_pk primary key (x,y)
  5  ) organization index;

Table created.

We create a secondary index on y:

SQL> create index secondary_idx on iot (y);

Index created.

We populate the table with some rows:

SQL> insert into iot select rownum,mod(rownum,3)
  2  from dual
  3  connect by level<=7;

7 rows created.

SQL> commit;

Commit complete.

Now we add another column, z, to the table, and we set some value in z in one row:

SQL> alter table iot add z number;

Table altered.

SQL> update iot set z=42 where x=1;

1 row updated.

SQL> commit;

Commit complete.

So currently this is the content of the table:

SQL> select * from iot;

         X          Y          Z
---------- ---------- ----------
         1          1         42
         2          2
         3          0
         4          1
         5          2
         6          0
         7          1

7 rows selected.

Note that z contains the value 42 in the first row and null in all the other rows.
But if we access the table via the secondary index – for example, getting all the records where y=1 – the value of z from the first row “spills over” into the other rows, and we get wrong results:

SQL> set autotrace on explain
SQL> select * from iot where y=1;

         X          Y          Z
---------- ---------- ----------
         1          1         42
         4          1         42
         7          1         42

3 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 177722221

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |     3 |   117 |     1   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| IOT_PK        |     3 |   117 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN| SECONDARY_IDX |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("Y"=1)
   2 - access("Y"=1)

Note
-----
   - dynamic sampling used for this statement (level=2)

It seems that the problem is not with the secondary index itself, as we get the correct rows; rather, something about accessing the table via the secondary index causes the wrong results.
If we hint the query to access the table directly, we get the correct results:

SQL> select /*+ index_ffs (iot) */ * from iot where y=1;

         X          Y          Z
---------- ---------- ----------
         1          1         42
         4          1
         7          1

3 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3252171408

-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |     3 |   117 |     2   (0)| 00:00:01 |
|*  1 |  INDEX FAST FULL SCAN| IOT_PK |     3 |   117 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("Y"=1)

Note
-----
   - dynamic sampling used for this statement (level=2)

EBR – Part 9: Adding a New Column

This is part 9 of a post series about Oracle Edition-Based Redefinition.

Visit the index page for all the parts of the series

Introduction

In part 7 (“Editioning Views”) I introduced our third development use case – adding a new column that represents a new business logic.

I emphasize the fact that it's a new business logic, because sometimes we add new columns that come instead of existing columns, replacing an existing business logic. As we'll see in future posts, such cases require some additional handling in online upgrades.

Now, after part 8 (“The Last Planned Downtime”), in which we converted our schema to use editioning views and made it fully ready for online upgrades, we can implement the requirement in an online way.

The Task

We want to add a new attribute – Eye Color – to the Person entity.
For that we need to add a new column – EYE_COLOR – to the PEOPLE table, and make the corresponding changes in the PEOPLE_DL and APP_MGR packages (adding an input parameter – i_eye_color – to the people_dl.add procedure and changing the implementation of people_dl.add and app_mgr.do_something accordingly).
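A minimal sketch of the core table and view steps, using the names from this series (the eye_color datatype is an assumption here, and the full upgrade, including the package changes, is in the post itself):

```sql
-- Tables are not editioned objects, so the new column is added to
-- the real table; clients in older editions simply don't see it,
-- because their editioning view does not project it.
alter table people$0 add eye_color varchar2(20);

-- In the new edition only, re-create the editioning view to expose it:
create or replace editioning view people as
  select id, first_name, last_name, phone_number, eye_color
  from   people$0;
```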

The Current State

At this point, V2 is our latest edition and it is exposed to the production clients. When connected to V2, we see the following objects: Continue reading “EBR – Part 9: Adding a New Column”

EBR – Part 8: The Last Planned Downtime

This is part 8 of a post series about Oracle Edition-Based Redefinition.

Visit the index page for all the parts of the series

Introduction

I concluded the previous post by stating that the application code should never reference tables directly; instead, every table should be covered by an editioning view and the application code should reference the views.

Starting from Scratch?

If you start developing a new system from scratch, I strongly recommend following this practice, even if initially there is no requirement for online upgrades. That way, you will be ready for online upgrades from day one, so should such a requirement arise at a later stage, you will not need to change anything – neither your data structures nor your code. The only overhead is maintaining an editioning view for every table, and I'm sure you'll realize very quickly that this overhead is negligible. As long as downtime during your upgrades is acceptable, you only need to enable your schema owner for editions (a one-time operation, as shown in part 3), and (by default) you will simply use a single edition – the default one.

Existing Systems

If you have an existing system and you want to enjoy the benefits of online upgrades and zero downtime, you need to do a one-time conversion of your schema – covering all the tables with editioning views. Actually, even if you have an existing system and you do not want to enjoy these benefits yet, I would still recommend getting ready for online upgrades now. The rationale is the same as in the previous section (“Starting from Scratch?”).
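The conversion itself is mechanical. A minimal sketch for a single table, following the naming convention used in this series and assuming the schema owner is already editions-enabled (the column list is illustrative):

```sql
-- Rename the real table, freeing the original name for the view:
alter table people rename to people$0;

-- Cover the table with an editioning view under the original name,
-- so existing application code keeps working unchanged:
create or replace editioning view people as
  select id, first_name, last_name, phone_number
  from   people$0;
```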

Converting the Demo Schema

Let's first handle the demo schema I've been using in this post series. We currently have the following objects: Continue reading “EBR – Part 8: The Last Planned Downtime”