SORT GROUP BY NOSORT ROLLUP

In the previous post we saw that the Oracle Optimizer has a special operation – SORT GROUP BY ROLLUP – for performing several aggregations of the same population in a single scan, as long as the grouping keys are in a “rollup form”; for example:

(a)
(a,b,c)
(a,b,c,d)

If there is an index that its leading part is the same as the rollup grouping key, and that at least one of its columns is defined as NOT NULL, and if the optimizer thinks that cost-wise it is worth it, then the sorting part of the operation is omitted and the operation becomes SORT GROUP BY NOSORT ROLLUP.

Here is an example (executed in Oracle 18.3).

SQL> create table t (
  2    x number not null,
  3    y number,
  4    z number,
  5    w char(1000)
  6  );

Table created.

SQL> insert into t (x,y,z,w)
  2  select mod(rownum, 3),
  3         mod(rownum, 3)+1,
  4         mod(rownum, 3)+2,
  5         rownum
  6  from dual
  7  connect by level <= 1000
  8  order by 1,2,3;

1000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

SQL> create index i on t (x,y,z);

Index created.

SQL> set timing on
SQL> set autotrace on
SQL> select grouping_id(x,y) grp_id,
  2         x,
  3         y,
  4         count(z)
  5  from   t
  6  group  by grouping sets ((x),(x,y));

    GRP_ID          X          Y   COUNT(Z)
---------- ---------- ---------- ----------
         0          0          1        333
         1          0                   333
         0          1          2        334
         1          1                   334
         0          2          3        333
         1          2                   333

6 rows selected.

Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 3362344319

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     3 |    27 |     4   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT ROLLUP|      |     3 |    27 |     4   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN           | I    |  1000 |  9000 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          4  physical reads
          0  redo size
        889  bytes sent via SQL*Net to client
        624  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          6  rows processed

Not all GROUPING SETS are created equal

Introduction

I’ve just realized, once again, that the Oracle Optimizer is even smarter than I thought.
The comments (by Iudith Mentzel and Thomas Mautsch) to my previous post, GROUPING SETS and COLLECT don’t get along, made me understand that not all GROUPING SETS were created equal.

The examples in this post are from Oracle 18.3

Extended Aggregation Options

GROUPING SETS, ROLLUP and CUBE are great features that enable us to perform several aggregations on the same population in a single statement, that is shorter, more elegant, and usually more efficient, than the alternative of combining (e.g., using UNION ALL) the results of the individual aggregations.

In my opinion, GROUPING SETS, ROLLUP and CUBE should be in the toolbox of every Oracle developer. If you are not familiar with them, I highly recommend reading Tim Hall‘s article https://oracle-base.com/articles/misc/rollup-cube-grouping-functions-and-grouping-sets.

Under the Hood

We can use GROUPING SETS with many different combinations, but the implementation under the hood may be different for different combinations.
Let’s take, for example, the following two queries. Note that the first query has to perform 4 aggregations (some of them with a composite grouping key), while the second query performs only 2 aggregations (with a single column grouping key).
Try to guess which of the two will run faster… Continue reading “Not all GROUPING SETS are created equal”

GROUPING SETS and COLLECT Don’t Get Along

I’ve recently got reminded that assuming something will work, just because it makes sense, doesn’t mean it will really work.

While reviewing some code a few days ago, I saw a query of the following form:

select 'X='||x, collect(z)
from t
group by x
union all
select 'Y='||y, collect(z)
from t
group by y;

I immediately recommended to convert it to use GROUPING SETS; like this:

select decode(grouping(x), 0, 'X='||x, 'Y='||y),
       collect(z)
from t
group by grouping sets (x,y);

The code will be shorter, more elegant, and probably more efficient. Great, isn’t it?
The only problem is that it doesn’t work 🙁

Let’s create a demo table:

SQL> create table t (
  2    x number,
  3    y number,
  4    z number
  5  );

Table created.

SQL> insert into t (x,y,z)
  2  select mod(rownum, 2),
  3         mod(rownum, 3),
  4         rownum
  5  from dual
  6  connect by level <= 10;

10 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

GROUPING SETS works nicely with most of the aggregate functions…

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y) grp_by,
  2         count(*) "CNT",
  3         count(distinct z) "CNT D",
  4         sum(z) "SUM",
  5         avg(z) "AVG",
  6         stddev(z) "STDDEV",
  7         min(z) "MIN",
  8         max(z) "MAX"
  9  from t
 10  group by grouping sets (x,y)
 11  order by 1;

GRP_BY      CNT CNT D  SUM  AVG STDDEV  MIN  MAX
---------- ---- ----- ---- ---- ------ ---- ----
X=0           5     5   30    6   3.16    2   10
X=1           5     5   25    5   3.16    1    9
Y=0           3     3   18    6      3    3    9
Y=1           4     4   22  5.5   3.87    1   10
Y=2           3     3   15    5      3    2    8

5 rows selected.

… but not with the COLLECT aggregate function.
In Oracle 11.2 and 12.1 we get ORA-604 and ORA-907:

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y),
  2         collect(z)
  3  from t
  4  group by grouping sets (x,y);
group by grouping sets (x,y)
             *
ERROR at line 4:
ORA-00604: error occurred at recursive SQL level 1
ORA-00907: missing right parenthesis

Looks like a bug.
But in Oracle 12.2 something has changed. No, the bug was not fixed. Instead, it is now officially not supported:

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y),
  2         collect(z)
  3  from t
  4  group by grouping sets (x,y);
       collect(z)
       *
ERROR at line 2:
ORA-03001: unimplemented feature

-- tested in 12.2, 18.3 and 19.3

So, at least for now, the original query should remain as is:

SQL> select 'X='||x grp_by, collect(z) coll
  2  from t
  3  group by x
  4  union all
  5  select 'Y='||y, collect(z)
  6  from t
  7  group by y;

GRP_BY     COLL
---------- --------------------------------------------------
X=0        ST00001HGTfH6lTUWkKMCXAmZAQg=(2, 10, 8, 6, 4)
X=1        ST00001HGTfH6lTUWkKMCXAmZAQg=(1, 9, 7, 5, 3)
Y=0        ST00001HGTfH6lTUWkKMCXAmZAQg=(3, 9, 6)
Y=1        ST00001HGTfH6lTUWkKMCXAmZAQg=(1, 10, 7, 4)
Y=2        ST00001HGTfH6lTUWkKMCXAmZAQg=(2, 8, 5)

EXPAND_SQL_TEXT – Much More Than Just Expanding Views

Overview

There are features in Oracle SQL that are implemented by other, older, features. This is a clever way for supporting a new syntax with low efforts and low risk – the Oracle Corp engineers only need to convert the SQL statement with the new syntax to an equivalent statement that uses the old syntax they already support. And Oracle has a perfect place for doing this conversion – the expansion stage in the parsing process.

SQL Expansion

When a SQL statement is processed, it goes through several stages, in this order: parsing, optimization, row source generation, and execution.

Note: Parsing is a confusing term, as many times when we say “parsing” (especially “hard parsing”) we actually mean “parsing + optimization + row source generation”.

The first stage, the parsing, is not too complex (comparing to the optimization stage). Continue reading “EXPAND_SQL_TEXT – Much More Than Just Expanding Views”

EBR – Part 12: Editions and Services

This is part 12 of a post series about Oracle Edition-Based Redefinition.

Visit the index page for all the parts of the series

Introduction

In the previous part of this series I wrote about using the Database Default Edition as a way for exposing new editions when using EBR for online application upgrades. As I wrote there, this is a simple method, but in my opinion it is also an impractical method in most cases. From my experience, the best way to expose new editions is by using services. This gives us high levels of both flexibility and control.

When we create or modify an Oracle service, we can specify the session edition for subsequent database connections using this service.
If our new edition includes changes that require code changes in the client-side (for example, when we make API changes), then we’ll expose the new edition by using a new service.
If the new edition includes only changes that are transparent to the client-side (for example, changes in package bodies only), then we don’t have to create a new service. Instead, we can modify the latest service to use the new edition from now on.

Examples

Let’s see some examples. Continue reading “EBR – Part 12: Editions and Services”

EBR – Part 11: Database-Level Default Edition

This is part 11 of a post series about Oracle Edition-Based Redefinition.

Visit the index page for all the parts of the series

Introduction

As we’ve seen in the previous posts, the process of an online application upgrade, in high level, includes the following steps:

  • Creating a new edition
  • Applying all the necessary changes in the privacy of the new, unexposed, edition
  • Making sure that all the objects in the new edition are valid, and that all the objects that should be actual in the new edition are indeed actual
  • Exposing the new edition to the clients

How do we expose the new edition?

In the previous examples I did it by explicitly changing the client’s session edition, from the client session itself. But that was just for demo purposes, and it is certainly not the recommended way.
The best way to expose new editions, in my opinion, is by using services. This gives us high levels of both flexibility and control.
I’ll dedicate the next post for exposing new editions via services.
In this post I’ll show another option – Database Default Edition. This method is very simple, but also very limiting, and therefore it’s suitable only for very specific cases.

Database Default Edition

There is always one database-level default edition. We can see it using this query: Continue reading “EBR – Part 11: Database-Level Default Edition”

ENABLE NOVALIDATE – Too Polite?

Onine DDL operations are much more polite than offline DDL operations. They usually wait patiently for transactions that hold resources they need until these transactions end, and they do not block new DML statements.

As I wrote in the past, adding a constraint as Enabled and Validated (which is the default for new constrtaints) is an offline operation, but if we split it into two DDL statements – one for adding the constraint as Enabled and Not Validated and the second for making the constraint Validated – then each of these two separate statements is an online operation.

In this post I’d like to show that the first step – creating the constraint as Enabled and Not Validated – is even “more online” than it seems.

Let’s create some table t and insert one record into it:

One> create table t (
  2    a number,
  3    b number
  4  );

Table created.

One> insert into t(a,b) values (111,-1);

1 row created.

I did not commit or rollback this transaction, so it is still open and it’s locking the table in RX mode.

Now, from another session (note the SQL Prompts “One” and “Two”), I’ll add an Enabled and Not Validated check constraint to the table:

Two> alter table t
  2    add constraint t_b_chk
  3    check (b>0) enable novalidate;

Session Two is blocked now by session One (the wait event is “enq: TX – row lock contention”). Since it is an online operation it just waits, without throwing an ORA-54 error as an offline operation would have done.

But actually, it seems that this wait is unnecessary. The operation has already happened.

Using session One, we can see that the constraint already appears in the data dictionary:

One> select constraint_name,
  2         search_condition,
  3         status,
  4         validated
  5  from user_constraints
  6  where table_name = 'T';
  
CONSTRAINT_NAME SEARCH_CONDITION STATUS     VALIDATED
--------------- ---------------- ---------- -------------
T_B_CHK         B>0              ENABLED    NOT VALIDATED

And if we try now to perform a DML that violates the constraint, we’ll get an error message, because the constraint is already enabled:

One> insert into t(a,b) values (222,-2);
insert into t(a,b) values (222,-2)
*
ERROR at line 1:
ORA-02290: check constraint (DEMO.T_B_CHK) violated

Session Two will be released as soon as session One either commits or rolls back, but it seems that it could have been released before that.
Even if we kill session Two before the transaction in session One ends, it doesn’t really matter, because the constraint has already been created and enabled.

ODC Appreciation Day : Pattern Matching in SQL

Here’s my contribution to the ODC Appreciation Day.

Pattern Matching in SQL, using the MATCH_RECOGNIZE clause, is one of my favorite features, but only recently I’ve used it “for real”.
MATCH_RECOGNIZE allows us to perform enhanced analysis of row sequences, and to detect sequences that match complex patterns.
This feature gave a significant boost to the analytical capabilities of SQL. It enables solving various types of problems in a simpler way than before, in much more elegant ways than before, and many times in much better performance.

I learned this feature back in 2013, just shortly after Oracle 12cR1 – the version that introduced the MATCH_RECOGNIZE clause – had been released.
The first time I talked about it was in November 2013, as part of a full day seminar I gave about Oracle 12c new features for developers.
I have been playing with it ever since – for example here, here and here – and have learned to appreciate its powerful capabilities.
I was really excited while giving a presentation about Pattern Matching at ilOUG Tech Days 2015. The reason for my excitement was that Keith Laker, the product manager of the feature I was just presenting, attended the session.

But I didn’t have a chance to use MATCH_RECOGNIZE in real-life use cases until recently. I did identify many cases where Pattern Matching would perfectly fit, but I couldn’t use it as I was still using Oracle 11g. Once starting using Oracle 12c, I have been enjoying using MATCH_RECOGNIZE for solving real problems in my work (although my first attempt revealed a nasty bug).

In a few weeks from now I’m going to present (a completely new version of) Oracle SQL Pattern Matching Made Easy in BGOUG Autumn Conference and in DOAG 2018. The purpose of this session is to explain the MATCH_RECOGNIZE feature from the basics, step-by-step, with many examples, and by that demonstrating how powerful and useful it is, and at the same time eliminating the fear from what may seem at first as complex syntax. If you’re going to be in one of these conferences, come and say hi 🙂

Here are some good resources for learning about Pattern Matching and its uses:

Index Hints and Distributed Queries

There are two ways to specify indexes in optimizer hints: by the index name, or by the index’s (leading) columns:

The latter is usually preferred, as the writer’s intention is clearer, and it’s immune to changes of the index name.

Note: one can present the opposite argument, that specifying the index name is immune to changes of column names

Recently I’ve used the option that I prefer – specifying the column names and not the index name – in a distributed query, hinting the remote table, and found out that it didn’t work.
The hint in the query that was passed to the remote database contained three question marks instead of the column name that I had specified.

Apparently, this weird replacement of column names by question marks happens quite early in the query execution, during the expansion stage:

SQL> set serveroutput on
SQL> DECLARE
  2      v_clob CLOB;
  3  BEGIN
  4      dbms_utility.expand_sql_text(input_sql_text  => 'select * from dual where exists (select /*+ index (t,(x)) */ null from t@dblink t where  x = :x and y = :y)',
  5                                   output_sql_text => v_clob);
  6      dbms_output.put_line(v_clob);
  7  END;
  8  /
SELECT "A1"."DUMMY" "DUMMY" FROM "SYS"."DUAL" "A1" WHERE  EXISTS (SELECT /*+
INDEX ("A2" ???) */ NULL "NULL" FROM "T"@DBLINK "A2" WHERE "A2"."X"=:B1 AND
"A2"."Y"=:B2)

PL/SQL procedure successfully completed.

Here is an example, tested in Continue reading “Index Hints and Distributed Queries”

Dropping Virtual Columns Causes Unnecessary Invalidation

When you drop a column from a table (or set a column unused), any view that references this column becomes invalid.
As of Oracle 11gR1, where Fine Grained Dependency Tracking was introduced, views that reference other columns of the table, but do not reference the dropped column, should not become invalid. And this is usually true, but this week I discovered a case where it’s not.

When dropping a virtual column, even views that do not reference the dropped column become invalid.

Let’s see an example (tested in 11g, 12c and 18c):

We create a table with 3 columns, one of them is virtual

ORA$BASE> create table t (
  2    column_in_view             number,
  3    regular_column_not_in_view number,
  4    virtual_column_not_in_view number as (42) virtual
  5  );

Table created.

And we create a view that projects only one of the table columns

ORA$BASE> create or replace view v as
  2    select column_in_view from t;

View created.

The view is valid at this point

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
VALID

Dropping the non-referenced non-virtual column does not invalidate the view (which is a good thing, of course)

ORA$BASE> alter table t drop column regular_column_not_in_view;

Table altered.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
VALID

However, dropping the non-referenced virtual column does invalidate the view

ORA$BASE> alter table t drop column virtual_column_not_in_view;

Table altered.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
INVALID

And we can see the same behavior when setting the columns unused instead of dropping them:

ORA$BASE> drop table t;

Table dropped.

ORA$BASE> create table t (
  2    column_in_view             number,
  3    regular_column_not_in_view number,
  4    virtual_column_not_in_view number as (42) virtual
  5  );

Table created.

ORA$BASE> create or replace view v as
  2    select column_in_view from t;

View created.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
VALID

ORA$BASE> alter table t set unused column regular_column_not_in_view;

Table altered.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
VALID

ORA$BASE> alter table t set unused column virtual_column_not_in_view;

Table altered.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
INVALID

ORA$BASE>

When using EBR (Edition-Based Redefinition) for online application upgrades, Fine Grained Dependency Tracking is very important. The common practice for dropping a column with no downtime is as follows:

  • In a new edition: take the column out of the editioning view that covers the table (and remove any references to it from the code)
  • Expose the new edition
  • As soon as all the sessions use the new edition: drop the column from the table (or set it unused)

This practice fails when the dropped column is a virtual one, due to the behavior we’ve seen above (which, in my opinion, is a bug). Until this bug is solved, I see two workarounds (which I don’t really like):

  • Just leave the virtual column there, unexposed to the editioning view
  • Create a new table (without that column) to replace the existing one (with the necessary migration work, including cross-edition triggers)

I’ll write more about this, and about other invalidation cases, in my EBR series.