Optimization of NOT NULL Constraint Creation

Several years ago I wrote a series of 8 posts about constraint creation optimization. I think it’s time to add some more posts to the series.
I showed there, among other things, that Oracle does a nice optimization – that may save plenty of time – when we add in a single ALTER TABLE statement a new (nullable with no default value) column and a corresponding:
– inline (a.k.a. “column-level”) check constraint
or
– foreign key constraint (either inline or out-of-line)

What about NOT NULL constraints?

Oracle has a special optimization for the case of adding a column with a NOT NULL constraint (and no default value).
It checks if the table contains at least one record. If it does, then we cannot add the new column, becuase the NOT NULL constraint will be violated for all the existing rows. If the table doesn’t contain any records, the new column can be added.

Let’s start with an empty table: Continue reading “Optimization of NOT NULL Constraint Creation”

SORT GROUP BY NOSORT ROLLUP

In the previous post we saw that the Oracle Optimizer has a special operation – SORT GROUP BY ROLLUP – for performing several aggregations of the same population in a single scan, as long as the grouping keys are in a “rollup form”; for example:

(a)
(a,b,c)
(a,b,c,d)

If there is an index that its leading part is the same as the rollup grouping key, and that at least one of its columns is defined as NOT NULL, and if the optimizer thinks that cost-wise it is worth it, then the sorting part of the operation is omitted and the operation becomes SORT GROUP BY NOSORT ROLLUP.

Here is an example (executed in Oracle 18.3).

SQL> create table t (
  2    x number not null,
  3    y number,
  4    z number,
  5    w char(1000)
  6  );

Table created.

SQL> insert into t (x,y,z,w)
  2  select mod(rownum, 3),
  3         mod(rownum, 3)+1,
  4         mod(rownum, 3)+2,
  5         rownum
  6  from dual
  7  connect by level <= 1000
  8  order by 1,2,3;

1000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

SQL> create index i on t (x,y,z);

Index created.

SQL> set timing on
SQL> set autotrace on
SQL> select grouping_id(x,y) grp_id,
  2         x,
  3         y,
  4         count(z)
  5  from   t
  6  group  by grouping sets ((x),(x,y));

    GRP_ID          X          Y   COUNT(Z)
---------- ---------- ---------- ----------
         0          0          1        333
         1          0                   333
         0          1          2        334
         1          1                   334
         0          2          3        333
         1          2                   333

6 rows selected.

Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 3362344319

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     3 |    27 |     4   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT ROLLUP|      |     3 |    27 |     4   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN           | I    |  1000 |  9000 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          4  physical reads
          0  redo size
        889  bytes sent via SQL*Net to client
        624  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          6  rows processed

Not all GROUPING SETS are created equal

Introduction

I’ve just realized, once again, that the Oracle Optimizer is even smarter than I thought.
The comments (by Iudith Mentzel and Thomas Mautsch) to my previous post, GROUPING SETS and COLLECT don’t get along, made me understand that not all GROUPING SETS were created equal.

The examples in this post are from Oracle 18.3

Extended Aggregation Options

GROUPING SETS, ROLLUP and CUBE are great features that enable us to perform several aggregations on the same population in a single statement, that is shorter, more elegant, and usually more efficient, than the alternative of combining (e.g., using UNION ALL) the results of the individual aggregations.

In my opinion, GROUPING SETS, ROLLUP and CUBE should be in the toolbox of every Oracle developer. If you are not familiar with them, I highly recommend reading Tim Hall‘s article https://oracle-base.com/articles/misc/rollup-cube-grouping-functions-and-grouping-sets.

Under the Hood

We can use GROUPING SETS with many different combinations, but the implementation under the hood may be different for different combinations.
Let’s take, for example, the following two queries. Note that the first query has to perform 4 aggregations (some of them with a composite grouping key), while the second query performs only 2 aggregations (with a single column grouping key).
Try to guess which of the two will run faster… Continue reading “Not all GROUPING SETS are created equal”

EXPAND_SQL_TEXT – Much More Than Just Expanding Views

Overview

There are features in Oracle SQL that are implemented by other, older, features. This is a clever way for supporting a new syntax with low efforts and low risk – the Oracle Corp engineers only need to convert the SQL statement with the new syntax to an equivalent statement that uses the old syntax they already support. And Oracle has a perfect place for doing this conversion – the expansion stage in the parsing process.

SQL Expansion

When a SQL statement is processed, it goes through several stages, in this order: parsing, optimization, row source generation, and execution.

Note: Parsing is a confusing term, as many times when we say “parsing” (especially “hard parsing”) we actually mean “parsing + optimization + row source generation”.

The first stage, the parsing, is not too complex (comparing to the optimization stage). Continue reading “EXPAND_SQL_TEXT – Much More Than Just Expanding Views”

Index Hints and Distributed Queries

There are two ways to specify indexes in optimizer hints: by the index name, or by the index’s (leading) columns:

The latter is usually preferred, as the writer’s intention is clearer, and it’s immune to changes of the index name.

Note: one can present the opposite argument, that specifying the index name is immune to changes of column names

Recently I’ve used the option that I prefer – specifying the column names and not the index name – in a distributed query, hinting the remote table, and found out that it didn’t work.
The hint in the query that was passed to the remote database contained three question marks instead of the column name that I had specified.

Apparently, this weird replacement of column names by question marks happens quite early in the query execution, during the expansion stage:

SQL> set serveroutput on
SQL> DECLARE
  2      v_clob CLOB;
  3  BEGIN
  4      dbms_utility.expand_sql_text(input_sql_text  => 'select * from dual where exists (select /*+ index (t,(x)) */ null from t@dblink t where  x = :x and y = :y)',
  5                                   output_sql_text => v_clob);
  6      dbms_output.put_line(v_clob);
  7  END;
  8  /
SELECT "A1"."DUMMY" "DUMMY" FROM "SYS"."DUAL" "A1" WHERE  EXISTS (SELECT /*+
INDEX ("A2" ???) */ NULL "NULL" FROM "T"@DBLINK "A2" WHERE "A2"."X"=:B1 AND
"A2"."Y"=:B2)

PL/SQL procedure successfully completed.

Here is an example, tested in Continue reading “Index Hints and Distributed Queries”

The Performance of the FIRST and LAST Functions

Overview

One of the first posts I wrote in this blog (almost five years ago) was about the FIRST and LAST aggregate functions.
These functions are, in a way, extended versions of the much more popular aggregate functions MIN and MAX.
MIN and MAX allow you (conceptually) to sort a group of rows by some column and return the value of that column from the first or last row in the sorted group.
The FIRST and LAST functions extend this ability, and allow you to sort a group of rows by one column, but return the value of another column from the first or last row in the sorted group.
You are welcome to read the original post for more details about the functionality and syntax of these functions, and for seeing some examples.

In this post I’d like to focus on the performance of the FIRST and LAST functions.
Continue reading “The Performance of the FIRST and LAST Functions”

A Single Query with Many Filter Combinations – Part 3

In a recent post I suggested a way to write a single SQL query that filters a table by one or more of several columns.
Here is the query from that post:

select * from employees where rowid in (
  select rid from (
    select rowid rid from employees where department_id = :department_id
    union all
    select rowid rid from employees where job_id = :job_id
    union all
    select rowid rid from employees where manager_id = :manager_id
    union all
    select rowid rid from employees where last_name = :last_name)
  group by rid
  having count(*) = nvl2(:department_id,1,0) + 
                    nvl2(:job_id,1,0) + 
                    nvl2(:manager_id,1,0) + 
                    nvl2(:last_name,1,0)
);

This is actually a simplification of a real task I handled some time ago.
In today’s post I’d like to look at a variation of the problem, which is closer to the real issue I handled. Continue reading “A Single Query with Many Filter Combinations – Part 3”

A Single Query with Many Filter Combinations – Part 2

In the previous post I suggested a way to write a single SQL query that filters the EMPLOYEES table (as an example) by one or more of the following columns: DEPARTMENT_ID, JOB_ID, MANAGER_ID and LAST_NAME.

Here is the query from the previous post:

select * from employees where rowid in (
  select rid from (
    select rowid rid from employees where department_id = :department_id
    union all
    select rowid rid from employees where job_id = :job_id
    union all
    select rowid rid from employees where manager_id = :manager_id
    union all
    select rowid rid from employees where last_name = :last_name)
  group by rid
  having count(*) = nvl2(:department_id,1,0) + 
                    nvl2(:job_id,1,0) + 
                    nvl2(:manager_id,1,0) + 
                    nvl2(:last_name,1,0)
);

In this query, it is assumed that at least one of the bind variables (:department_id, :job_id, :manager_id, :last_name) contains a non-null value.
If all of them contain nulls, then the result set of the query is empty. Note that in this case Oracle will not even try looking for data; as Stew Ashton commented in the previous post, the Index Range Scan will do no consistent gets for a predicate such as “column_name = :bind_variable” when :bind_variable is null (because Oracle knows that no row satisfies a “column=null” condition).

If we want to support filtering by zero or more of the columns, we should add (union) a subquery to handle the case when all the bind variables are nulls: Continue reading “A Single Query with Many Filter Combinations – Part 2”

A Single Query with Many Filter Combinations

Let’s assume the EMPLOYEES table (from the HR schema) contains many records, and we want to write an (efficient) SQL query that filters it by one or more of the following columns: DEPARTMENT_ID, JOB_ID, MANAGER_ID and LAST_NAME.
For example:

  • in one execution we may want to get all the employees with DEPARTMENT_ID = 80
  • in a second execution all the employees that their LAST_NAME is King
  • in a third execution all the employees that their JOB_ID is ST_CLERK and their MANAGER_ID is 124
  • and so on

These columns are indexed, each one in a separate index:

SQL> select index_name,
  2         listagg(column_name, ',') within group(order by column_position) index_columns
  3  from user_ind_columns
  4  where table_name = 'EMPLOYEES'
  5  group by index_name;

INDEX_NAME           INDEX_COLUMNS
-------------------- --------------------
EMP_DEPARTMENT_IX    DEPARTMENT_ID
EMP_EMAIL_UK         EMAIL
EMP_EMP_ID_PK        EMPLOYEE_ID
EMP_JOB_IX           JOB_ID
EMP_MANAGER_IX       MANAGER_ID
EMP_NAME_IX          LAST_NAME,FIRST_NAME

6 rows selected.

Many Queries, Many Indexes

We can write 15 different queries – a query for every possible combination. Continue reading “A Single Query with Many Filter Combinations”

Adding a Unique Constraint in an Online Way

Note: unlike most of my posts, this one assumes using Enterprise Edition

I have a table t and I want to add a unique constraint on one of its columns – c1.

The Offline Way

The straightforward and most simple way to do it is using a single alter table statement:

SQL> alter table t add constraint c1_uk unique (c1);

Table altered.

By default, Oracle creates in this operation a unique constraint (named c1_uk) and a corresponding unique index (named c1_uk as well) that enforces the constraint.
The downside is that this is an offline operation – the table is locked in Share mode.
This is true even if we specify that the creation of the index is online:

SQL> alter table t add constraint c1_uk unique (c1) using index online;

Table altered.

If the table contains many records, the creation of the index may take a significant amount of time, during which the table is locked and DML operations on the table are blocked.

The Online Way

We can create the unique constraint in an online way, by splitting the operation into three steps: Continue reading “Adding a Unique Constraint in an Online Way”