GROUPING SETS and COLLECT Don’t Get Along

I’ve recently got reminded that assuming something will work, just because it makes sense, doesn’t mean it will really work.

While reviewing some code a few days ago, I saw a query of the following form:

select 'X='||x, collect(z)
from t
group by x
union all
select 'Y='||y, collect(z)
from t
group by y;

I immediately recommended to convert it to use GROUPING SETS; like this:

select decode(grouping(x), 0, 'X='||x, 'Y='||y),
       collect(z)
from t
group by grouping sets (x,y);

The code will be shorter, more elegant, and probably more efficient. Great, isn’t it?
The only problem is that it doesn’t work 🙁

Let’s create a demo table:

SQL> create table t (
  2    x number,
  3    y number,
  4    z number
  5  );

Table created.

SQL> insert into t (x,y,z)
  2  select mod(rownum, 2),
  3         mod(rownum, 3),
  4         rownum
  5  from dual
  6  connect by level <= 10;

10 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

GROUPING SETS works nicely with most of the aggregate functions…

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y) grp_by,
  2         count(*) "CNT",
  3         count(distinct z) "CNT D",
  4         sum(z) "SUM",
  5         avg(z) "AVG",
  6         stddev(z) "STDDEV",
  7         min(z) "MIN",
  8         max(z) "MAX"
  9  from t
 10  group by grouping sets (x,y)
 11  order by 1;

GRP_BY      CNT CNT D  SUM  AVG STDDEV  MIN  MAX
---------- ---- ----- ---- ---- ------ ---- ----
X=0           5     5   30    6   3.16    2   10
X=1           5     5   25    5   3.16    1    9
Y=0           3     3   18    6      3    3    9
Y=1           4     4   22  5.5   3.87    1   10
Y=2           3     3   15    5      3    2    8

5 rows selected.

… but not with the COLLECT aggregate function.
In Oracle 11.2 and 12.1 we get ORA-604 and ORA-907:

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y),
  2         collect(z)
  3  from t
  4  group by grouping sets (x,y);
group by grouping sets (x,y)
             *
ERROR at line 4:
ORA-00604: error occurred at recursive SQL level 1
ORA-00907: missing right parenthesis

Looks like a bug.
But in Oracle 12.2 something has changed. No, the bug was not fixed. Instead, it is now officially not supported:

SQL> select decode(grouping(x), 0, 'X='||x, 'Y='||y),
  2         collect(z)
  3  from t
  4  group by grouping sets (x,y);
       collect(z)
       *
ERROR at line 2:
ORA-03001: unimplemented feature

-- tested in 12.2, 18.3 and 19.3

So, at least for now, the original query should remain as is:

SQL> select 'X='||x grp_by, collect(z) coll
  2  from t
  3  group by x
  4  union all
  5  select 'Y='||y, collect(z)
  6  from t
  7  group by y;

GRP_BY     COLL
---------- --------------------------------------------------
X=0        ST00001HGTfH6lTUWkKMCXAmZAQg=(2, 10, 8, 6, 4)
X=1        ST00001HGTfH6lTUWkKMCXAmZAQg=(1, 9, 7, 5, 3)
Y=0        ST00001HGTfH6lTUWkKMCXAmZAQg=(3, 9, 6)
Y=1        ST00001HGTfH6lTUWkKMCXAmZAQg=(1, 10, 7, 4)
Y=2        ST00001HGTfH6lTUWkKMCXAmZAQg=(2, 8, 5)

EXPAND_SQL_TEXT – Much More Than Just Expanding Views

Overview

There are features in Oracle SQL that are implemented by other, older, features. This is a clever way for supporting a new syntax with low efforts and low risk – the Oracle Corp engineers only need to convert the SQL statement with the new syntax to an equivalent statement that uses the old syntax they already support. And Oracle has a perfect place for doing this conversion – the expansion stage in the parsing process.

SQL Expansion

When a SQL statement is processed, it goes through several stages, in this order: parsing, optimization, row source generation, and execution.

Note: Parsing is a confusing term, as many times when we say “parsing” (especially “hard parsing”) we actually mean “parsing + optimization + row source generation”.

The first stage, the parsing, is not too complex (comparing to the optimization stage). Continue reading “EXPAND_SQL_TEXT – Much More Than Just Expanding Views”

EBR – Part 12: Editions and Services

This is part 12 of a post series about Oracle Edition-Based Redefinition.

Visit the index page for all the parts of the series

Introduction

In the previous part of this series I wrote about using the Database Default Edition as a way for exposing new editions when using EBR for online application upgrades. As I wrote there, this is a simple method, but in my opinion it is also an impractical method in most cases. From my experience, the best way to expose new editions is by using services. This gives us high levels of both flexibility and control.

When we create or modify an Oracle service, we can specify the session edition for subsequent database connections using this service.
If our new edition includes changes that require code changes in the client-side (for example, when we make API changes), then we’ll expose the new edition by using a new service.
If the new edition includes only changes that are transparent to the client-side (for example, changes in package bodies only), then we don’t have to create a new service. Instead, we can modify the latest service to use the new edition from now on.

Examples

Let’s see some examples. Continue reading “EBR – Part 12: Editions and Services”

EBR – Part 11: Database-Level Default Edition

This is part 11 of a post series about Oracle Edition-Based Redefinition.

Visit the index page for all the parts of the series

Introduction

As we’ve seen in the previous posts, the process of an online application upgrade, in high level, includes the following steps:

Creating a new edition
Applying all the necessary changes in the privacy of the new, unexposed, edition
Making sure that all the objects in the new edition are valid, and that all the objects that should be actual in the new edition are indeed actual
Exposing the new edition to the clients

How do we expose the new edition?

In the previous examples I did it by explicitly changing the client’s session edition, from the client session itself. But that was just for demo purposes, and it is certainly not the recommended way.
The best way to expose new editions, in my opinion, is by using services. This gives us high levels of both flexibility and control.
I’ll dedicate the next post for exposing new editions via services.
In this post I’ll show another option – Database Default Edition. This method is very simple, but also very limiting, and therefore it’s suitable only for very specific cases.

Database Default Edition

There is always one database-level default edition. We can see it using this query: Continue reading “EBR – Part 11: Database-Level Default Edition”

ENABLE NOVALIDATE – Too Polite?

Onine DDL operations are much more polite than offline DDL operations. They usually wait patiently for transactions that hold resources they need until these transactions end, and they do not block new DML statements.

As I wrote in the past, adding a constraint as Enabled and Validated (which is the default for new constrtaints) is an offline operation, but if we split it into two DDL statements – one for adding the constraint as Enabled and Not Validated and the second for making the constraint Validated – then each of these two separate statements is an online operation.

In this post I’d like to show that the first step – creating the constraint as Enabled and Not Validated – is even “more online” than it seems.

Let’s create some table t and insert one record into it:

One> create table t (
  2    a number,
  3    b number
  4  );

Table created.

One> insert into t(a,b) values (111,-1);

1 row created.

I did not commit or rollback this transaction, so it is still open and it’s locking the table in RX mode.

Now, from another session (note the SQL Prompts “One” and “Two”), I’ll add an Enabled and Not Validated check constraint to the table:

Two> alter table t
  2    add constraint t_b_chk
  3    check (b>0) enable novalidate;

Session Two is blocked now by session One (the wait event is “enq: TX – row lock contention”). Since it is an online operation it just waits, without throwing an ORA-54 error as an offline operation would have done.

But actually, it seems that this wait is unnecessary. The operation has already happened.

Using session One, we can see that the constraint already appears in the data dictionary:

One> select constraint_name,
  2         search_condition,
  3         status,
  4         validated
  5  from user_constraints
  6  where table_name = 'T';
  
CONSTRAINT_NAME SEARCH_CONDITION STATUS     VALIDATED
--------------- ---------------- ---------- -------------
T_B_CHK         B>0              ENABLED    NOT VALIDATED

And if we try now to perform a DML that violates the constraint, we’ll get an error message, because the constraint is already enabled:

One> insert into t(a,b) values (222,-2);
insert into t(a,b) values (222,-2)
*
ERROR at line 1:
ORA-02290: check constraint (DEMO.T_B_CHK) violated

Session Two will be released as soon as session One either commits or rolls back, but it seems that it could have been released before that.
Even if we kill session Two before the transaction in session One ends, it doesn’t really matter, because the constraint has already been created and enabled.

ODC Appreciation Day : Pattern Matching in SQL

Here’s my contribution to the ODC Appreciation Day.

Pattern Matching in SQL, using the MATCH_RECOGNIZE clause, is one of my favorite features, but only recently I’ve used it “for real”.
MATCH_RECOGNIZE allows us to perform enhanced analysis of row sequences, and to detect sequences that match complex patterns.
This feature gave a significant boost to the analytical capabilities of SQL. It enables solving various types of problems in a simpler way than before, in much more elegant ways than before, and many times in much better performance.

I learned this feature back in 2013, just shortly after Oracle 12cR1 – the version that introduced the MATCH_RECOGNIZE clause – had been released.
The first time I talked about it was in November 2013, as part of a full day seminar I gave about Oracle 12c new features for developers.
I have been playing with it ever since – for example here, here and here – and have learned to appreciate its powerful capabilities.
I was really excited while giving a presentation about Pattern Matching at ilOUG Tech Days 2015. The reason for my excitement was that Keith Laker, the product manager of the feature I was just presenting, attended the session.

But I didn’t have a chance to use MATCH_RECOGNIZE in real-life use cases until recently. I did identify many cases where Pattern Matching would perfectly fit, but I couldn’t use it as I was still using Oracle 11g. Once starting using Oracle 12c, I have been enjoying using MATCH_RECOGNIZE for solving real problems in my work (although my first attempt revealed a nasty bug).

In a few weeks from now I’m going to present (a completely new version of) Oracle SQL Pattern Matching Made Easy in BGOUG Autumn Conference and in DOAG 2018. The purpose of this session is to explain the MATCH_RECOGNIZE feature from the basics, step-by-step, with many examples, and by that demonstrating how powerful and useful it is, and at the same time eliminating the fear from what may seem at first as complex syntax. If you’re going to be in one of these conferences, come and say hi 🙂

Here are some good resources for learning about Pattern Matching and its uses:

Index Hints and Distributed Queries

There are two ways to specify indexes in optimizer hints: by the index name, or by the index’s (leading) columns:

The latter is usually preferred, as the writer’s intention is clearer, and it’s immune to changes of the index name.

Note: one can present the opposite argument, that specifying the index name is immune to changes of column names

Recently I’ve used the option that I prefer – specifying the column names and not the index name – in a distributed query, hinting the remote table, and found out that it didn’t work.
The hint in the query that was passed to the remote database contained three question marks instead of the column name that I had specified.

Apparently, this weird replacement of column names by question marks happens quite early in the query execution, during the expansion stage:

SQL> set serveroutput on
SQL> DECLARE
  2      v_clob CLOB;
  3  BEGIN
  4      dbms_utility.expand_sql_text(input_sql_text  => 'select * from dual where exists (select /*+ index (t,(x)) */ null from t@dblink t where  x = :x and y = :y)',
  5                                   output_sql_text => v_clob);
  6      dbms_output.put_line(v_clob);
  7  END;
  8  /
SELECT "A1"."DUMMY" "DUMMY" FROM "SYS"."DUAL" "A1" WHERE  EXISTS (SELECT /*+
INDEX ("A2" ???) */ NULL "NULL" FROM "T"@DBLINK "A2" WHERE "A2"."X"=:B1 AND
"A2"."Y"=:B2)

PL/SQL procedure successfully completed.

Here is an example, tested in Continue reading “Index Hints and Distributed Queries”

Dropping Virtual Columns Causes Unnecessary Invalidation

When you drop a column from a table (or set a column unused), any view that references this column becomes invalid.
As of Oracle 11gR1, where Fine Grained Dependency Tracking was introduced, views that reference other columns of the table, but do not reference the dropped column, should not become invalid. And this is usually true, but this week I discovered a case where it’s not.

When dropping a virtual column, even views that do not reference the dropped column become invalid.

Let’s see an example (tested in 11g, 12c and 18c):

We create a table with 3 columns, one of them is virtual

ORA$BASE> create table t (
  2    column_in_view             number,
  3    regular_column_not_in_view number,
  4    virtual_column_not_in_view number as (42) virtual
  5  );

Table created.

And we create a view that projects only one of the table columns

ORA$BASE> create or replace view v as
  2    select column_in_view from t;

View created.

The view is valid at this point

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
VALID

Dropping the non-referenced non-virtual column does not invalidate the view (which is a good thing, of course)

ORA$BASE> alter table t drop column regular_column_not_in_view;

Table altered.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
VALID

However, dropping the non-referenced virtual column does invalidate the view

ORA$BASE> alter table t drop column virtual_column_not_in_view;

Table altered.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
INVALID

And we can see the same behavior when setting the columns unused instead of dropping them:

ORA$BASE> drop table t;

Table dropped.

ORA$BASE> create table t (
  2    column_in_view             number,
  3    regular_column_not_in_view number,
  4    virtual_column_not_in_view number as (42) virtual
  5  );

Table created.

ORA$BASE> create or replace view v as
  2    select column_in_view from t;

View created.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
VALID

ORA$BASE> alter table t set unused column regular_column_not_in_view;

Table altered.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
VALID

ORA$BASE> alter table t set unused column virtual_column_not_in_view;

Table altered.

ORA$BASE> select status from user_objects where object_name = 'V';

STATUS
-------
INVALID

ORA$BASE>

When using EBR (Edition-Based Redefinition) for online application upgrades, Fine Grained Dependency Tracking is very important. The common practice for dropping a column with no downtime is as follows:

In a new edition: take the column out of the editioning view that covers the table (and remove any references to it from the code)
Expose the new edition
As soon as all the sessions use the new edition: drop the column from the table (or set it unused)

This practice fails when the dropped column is a virtual one, due to the behavior we’ve seen above (which, in my opinion, is a bug). Until this bug is solved, I see two workarounds (which I don’t really like):

Just leave the virtual column there, unexposed to the editioning view
Create a new table (without that column) to replace the existing one (with the necessary migration work, including cross-edition triggers)

I’ll write more about this, and about other invalidation cases, in my EBR series.

EBR – Part 10: Data Dictionary Views for Editioning Views

This is part 10 of a post series about Oracle Edition-Based Redefinition.

Visit the index page for all the parts of the series

Introduction

In a previous post I wrote about Editioning Views and their significant role in online application upgrades using EBR.
In this post we’ll see how editioning views are represented in the data dictionary views.

I’ll discuss only the USER_* views, but everything is true for the corresponding ALL_*, DBA_* and CDB_* views as well

Setup

At this point, after completing three upgrades already (in part 3, part 5 and part 9), our latest edition is V3, and we have the PEOPLE$0 table and two actual instances of the PEOPLE editioning view (one in V2 and one in V3).

Let’s create a new edition – V4:

-- connected as system
ORA$BASE> create edition v4;

Edition created.

ORA$BASE> grant use on edition v4 to demo_ebr;

Grant succeeded.

In the new edition, let’s change the PEOPLE editioning view and give different aliases to some of the columns:

ORA$BASE> conn demo_ebr
Enter password:
Connected.
ORA$BASE> @set_edition v4

Session altered.

V4> create or replace editioning view people as
  2  select id,
  3         first_name   as given_name,
  4         last_name    as family_name,
  5         phone_number,
  6         eye_color
  7  from   people$0;

View created.

And let’s create also a regular view on top of PEOPLE:

V4> create view brown_eyed_people as
  2  select *
  3  from   people
  4  where  eye_color = 'Brown';

View created.

Note that the BROWN_EYED_PEOPLE regular view references the PEOPLE editioning view. As described and emphasized, only editioning views reference tables; all the other editioned objects (including regular views) should reference editioning views (or other regular views) only.

USER_OBJECTS

In USER_OBJECTS and in USER_OBJECTS_AE both regular views and editioning views appear simply as views (the OBJECT_TYPE column contains the value “VIEW”). Editioning views have no distinctive indication here.

V4>  select object_type,
  2         object_name,
  3         edition_name
  4  from   user_objects
  5  where  object_type in ('TABLE', 'VIEW')
  6  order by 1,2;

OBJECT_TYPE  OBJECT_NAME                    EDITION_NAME
------------ ------------------------------ ------------
TABLE        PEOPLE$0
VIEW         BROWN_EYED_PEOPLE              V4
VIEW         PEOPLE                         V4

V4>  select object_type,
  2         object_name,
  3         edition_name
  4  from   user_objects_ae
  5  where  object_type in ('TABLE', 'VIEW')
  6  order by 1,2,3;

OBJECT_TYPE  OBJECT_NAME                    EDITION_NAME
------------ ------------------------------ ------------
TABLE        PEOPLE$0
VIEW         BROWN_EYED_PEOPLE              V4
VIEW         PEOPLE                         V2
VIEW         PEOPLE                         V3
VIEW         PEOPLE                         V4

USER_VIEWS

In USER_VIEWS and in USER_VIEWS_AE there is a column – EDITIONING_VIEW – that tells whether a view is an editioning one or not:

V4>  select view_name,
  2         editioning_view
  3  from   user_views
  4  order by 1;

VIEW_NAME            EDITIONING_VIEW
-------------------- -----------------
BROWN_EYED_PEOPLE    N
PEOPLE               Y

V4>  select view_name,
  2         editioning_view,
  3         edition_name
  4  from   user_views_ae
  5  order by 1,3;

VIEW_NAME            EDITIONING_VIEW   EDITION_NAME
-------------------- ----------------- ------------
BROWN_EYED_PEOPLE    N                 V4
PEOPLE               Y                 V2
PEOPLE               Y                 V3
PEOPLE               Y                 V4

USER_EDITIONING_VIEWS

There is a dedicated data dictionary view – USER_EDITIONING_VIEWS – that contains only the editioning views.
For each editioning view we can see the view name and the name of the table covered by this view. Recall that (in a specific edition) an editioning view covers one and only one table, and a table may be covered by one editioning view at the most.

V4> select * from user_editioning_views;

VIEW_NAME       TABLE_NAME
--------------- ---------------
PEOPLE          PEOPLE$0

And there is a corresponding USER_EDITIONING_VIEWS_AE view:

V4> select * from user_editioning_views_ae;

VIEW_NAME       TABLE_NAME      EDITION_NAME
--------------- --------------- ------------
PEOPLE          PEOPLE$0        V2
PEOPLE          PEOPLE$0        V3
PEOPLE          PEOPLE$0        V4

USER_EDITIONING_VIEW_COLS

There is a dedicated data dictionary view – USER_EDITIONING_VIEW_COLS – that contains for every editioning view all its columns and the corresponding projected columns from the covered table.

V4> select *
  2  from user_editioning_view_cols
  3  order by 1,2;

VIEW_NAME       VIEW_COLUMN_ID VIEW_COLUMN_NAME     TABLE_COLUMN_ID TABLE_COLUMN_NAME
--------------- -------------- -------------------- --------------- --------------------
PEOPLE                       1 ID                                 1 ID
PEOPLE                       2 GIVEN_NAME                         2 FIRST_NAME
PEOPLE                       3 FAMILY_NAME                        3 LAST_NAME
PEOPLE                       4 PHONE_NUMBER                       4 PHONE_NUMBER
PEOPLE                       5 EYE_COLOR                          5 EYE_COLOR

There is also a corresponding USER_EDITIONING_VIEW_COLS_AE view. Let’s use it to see how the PEOPLE editioning view has evolved in the last 3 editions:

V4> select v2,v3,v4
  2  from   (select *
  3          from   user_editioning_view_cols_ae
  4          where  view_name = 'PEOPLE')
  5  pivot(
  6    max(view_column_name || nullif(' [' || nullif(table_column_name, view_column_name) || ']', ' []'))
  7    for edition_name in ('V2' as v2, 'V3' as v3, 'V4' as v4)
  8  )
  9  order  by view_column_id;

V2             V3             V4
-------------- -------------- -------------------------
ID             ID             ID
FIRST_NAME     FIRST_NAME     GIVEN_NAME [FIRST_NAME]
LAST_NAME      LAST_NAME      FAMILY_NAME [LAST_NAME]
PHONE_NUMBER   PHONE_NUMBER   PHONE_NUMBER
               EYE_COLOR      EYE_COLOR

For other parts of the EBR series, please visit the index page.

Wrong Results with IOT, Added Column and Secondary Index

Update: I reported about this bug to Oracle, and they fixed it in version 20.1

I found a “wrong results” bug yesterday, easily reproduced in 11g, 12c and 18c.

In short, we may get wrong results under the following circumstances:

We have an Index-Organized Table (IOT) with multi-column primary key, populated with rows
The table has a secondary index on part of the primary key columns
We add another column to the existing IOT
We select from the IOT while accessing it via the secondary index

Following is a simple example (also uploaded to livesql.oracle.com).

We create an index-organized table named iot, including the columns x and y.
The primary key is comprised of both columns.

SQL> create table iot (
  2    x number,
  3    y number,
  4    constraint iot_pk primary key (x,y)
  5  ) organization index;

Table created.

We create a secondary index on y:

SQL> create index secondary_idx on iot (y);

Index created.

We populate the table with some rows:

SQL> insert into iot select rownum,mod(rownum,3)
  2  from dual
  3  connect by level<=7;

7 rows created.

SQL> commit;

Commit complete.

Now we add another column, z, to the table, and we set some value in z in one row:

SQL> alter table iot add z number;

Table altered.

SQL> update iot set z=42 where x=1;

1 row updated.

SQL> commit;

Commit complete.

So currently this is the content of the table:

SQL> select * from iot;

         X          Y          Z
---------- ---------- ----------
         1          1         42
         2          2
         3          0
         4          1
         5          2
         6          0
         7          1

7 rows selected.

Note that z contains the value 42 in the first row and null in all the other rows.
But if we access the table via the secondary index – for example getting all the records where y=1 – the value of z from the first row “spills over” into the other rows, and we get wrong results:

SQL> set autotrace on explain
SQL> select * from iot where y=1;

         X          Y          Z
---------- ---------- ----------
         1          1         42
         4          1         42
         7          1         42

3 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 177722221

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |     3 |   117 |     1   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| IOT_PK        |     3 |   117 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN| SECONDARY_IDX |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("Y"=1)
   2 - access("Y"=1)

Note
-----
   - dynamic sampling used for this statement (level=2)

It seems that the problem is not with the secondary index itself, as we get the correct rows, but something with accessing the table via the secondary index causes the wrong results.
If we hint the query to access the table directly we get the correct results:

SQL> select /*+ index_ffs (iot) */ * from iot where y=1;

         X          Y          Z
---------- ---------- ----------
         1          1         42
         4          1
         7          1

3 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3252171408

-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |     3 |   117 |     2   (0)| 00:00:01 |
|*  1 |  INDEX FAST FULL SCAN| IOT_PK |     3 |   117 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("Y"=1)

Note
-----
   - dynamic sampling used for this statement (level=2)