Index Hints and Distributed Queries

There are two ways to specify indexes in optimizer hints: by the index name, or by the index’s (leading) columns:

The latter is usually preferred, as the writer’s intention is clearer, and it’s immune to changes of the index name.

Note: one can present the opposite argument, that specifying the index name is immune to changes of column names

Recently I’ve used the option that I prefer – specifying the column names and not the index name – in a distributed query, hinting the remote table, and found out that it didn’t work.
The hint in the query that was passed to the remote database contained three question marks instead of the column name that I had specified.

Apparently, this weird replacement of column names by question marks happens quite early in the query execution, during the expansion stage:

SQL> set serveroutput on
SQL> DECLARE
  2      v_clob CLOB;
  3  BEGIN
  4      dbms_utility.expand_sql_text(input_sql_text  => 'select * from dual where exists (select /*+ index (t,(x)) */ null from t@dblink t where  x = :x and y = :y)',
  5                                   output_sql_text => v_clob);
  6      dbms_output.put_line(v_clob);
  7  END;
  8  /
SELECT "A1"."DUMMY" "DUMMY" FROM "SYS"."DUAL" "A1" WHERE  EXISTS (SELECT /*+
INDEX ("A2" ???) */ NULL "NULL" FROM "T"@DBLINK "A2" WHERE "A2"."X"=:B1 AND
"A2"."Y"=:B2)

PL/SQL procedure successfully completed.

Here is an example, tested in 11g, 12c and 18c.
We’ll create a table with two indexes and then execute four distributed queries – two queries for every index (one with a “by index name” hint and the other with a “by column name” hint).
We can see that when hinting by the index name, the remote execution plan uses the requested index. However, when hinting by the column name, we get the same remote execution plan, independently of the index column that was specified.

Setup

SQL> create database link dblink connect to demo identified by demo using 'localhost/...';

Database link created.

SQL> create table t (
  2    x number,
  3    y number,
  4    z number
  5  );

Table created.

SQL>
SQL> create index t_x on t(x);

Index created.

SQL> create index t_y on t(y);

Index created.

SQL> var x number
SQL> var y number

Test 1: hinting T_X using the index name

SQL> select * from dual
  2  where exists (
  3    select /*+ index (t,t_x) */ null
  4    from   t@dblink t
  5    where  x = :x
  6    and    y = :y);

no rows selected
SQL> select * from table(dbms_xplan.display_cursor);

-------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     2 (100)|          |        |      |
|*  1 |  FILTER            |      |       |       |            |          |        |      |
|   2 |   TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |        |      |
|   3 |   REMOTE           |      |       |       |            |          | DBLINK | R->S |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)

Remote SQL Information (identified by operation id):
----------------------------------------------------
   3 - SELECT /*+ INDEX ("A1" "T_X") */ 0 FROM "T" "A1" WHERE "A1"."X"=:1 AND
       "A1"."Y"=:2 (accessing 'DBLINK' )
SQL> select sql_id,child_number,to_char(last_active_time,'hh24:mi:ss') from v$sql
  2  where sql_text like 'SELECT /*+ INDEX ("A1" "T_X") */ 0%';

SQL_ID         CN TO_CHAR(LAST_ACTIVE_TIME
------------- --- ------------------------
cy8yb2a941a00   0 10:43:48
SQL> select * from table(dbms_xplan.display_cursor('cy8yb2a941a00',0));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  cy8yb2a941a00, child number 0
-------------------------------------
SELECT /*+ INDEX ("A1" "T_X") */ 0 FROM "T" "A1" WHERE "A1"."X"=:1 AND
"A1"."Y"=:2

--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |       |       |     1 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |     1 |    26 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_X  |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------- 

Test 2: hinting T_Y using the index name

SQL> select * from dual
  2  where exists (
  3    select /*+ index (t,t_y) */ null
  4    from   t@dblink t
  5    where  x = :x
  6    and    y = :y);

no rows selected
SQL> select * from table(dbms_xplan.display_cursor);

-------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     2 (100)|          |        |      |
|*  1 |  FILTER            |      |       |       |            |          |        |      |
|   2 |   TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |        |      |
|   3 |   REMOTE           |      |       |       |            |          | DBLINK | R->S |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)

Remote SQL Information (identified by operation id):
----------------------------------------------------
   3 - SELECT /*+ INDEX ("A1" "T_Y") */ 0 FROM "T" "A1" WHERE "A1"."X"=:1 AND
       "A1"."Y"=:2 (accessing 'DBLINK' )
SQL> select sql_id,child_number,to_char(last_active_time,'hh24:mi:ss') from v$sql
  2  where sql_text like 'SELECT /*+ INDEX ("A1" "T_Y") */ 0%';

SQL_ID         CN TO_CHAR(LAST_ACTIVE_TIME
------------- --- ------------------------
4g6tru5214a5a   0 10:44:25
SQL> select * from table(dbms_xplan.display_cursor('4g6tru5214a5a',0));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  4g6tru5214a5a, child number 0
-------------------------------------
SELECT /*+ INDEX ("A1" "T_Y") */ 0 FROM "T" "A1" WHERE "A1"."X"=:1 AND
"A1"."Y"=:2

Plan hash value: 132089688

--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |       |       |     1 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |     1 |    26 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_Y  |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Test 3: hinting T_X using the column name

SQL> select * from dual
  2  where exists (
  3    select /*+ index (t,(x)) */ null
  4    from   t@dblink t
  5    where  x = :x
  6    and    y = :y);

no rows selected
SQL> select * from table(dbms_xplan.display_cursor);

-------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     2 (100)|          |        |      |
|*  1 |  FILTER            |      |       |       |            |          |        |      |
|   2 |   TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |        |      |
|   3 |   REMOTE           |      |       |       |            |          | DBLINK | R->S |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)

Remote SQL Information (identified by operation id):
----------------------------------------------------
   3 - SELECT /*+ INDEX ("A1" ???) */ 0 FROM "T" "A1" WHERE "A1"."X"=:1 AND
       "A1"."Y"=:2 (accessing 'DBLINK' )
SQL> select sql_id,child_number,to_char(last_active_time,'hh24:mi:ss') from v$sql
  2  where sql_text like 'SELECT /*+ INDEX ("A1" ???) */ 0%';

SQL_ID         CN TO_CHAR(LAST_ACTIVE_TIME
------------- --- ------------------------
5czydhvwxgctm   0 10:44:43

1 row selected.
SQL> select * from table(dbms_xplan.display_cursor('5czydhvwxgctm',0));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  5czydhvwxgctm, child number 0
-------------------------------------
SELECT /*+ INDEX ("A1" ???) */ 0 FROM "T" "A1" WHERE "A1"."X"=:1 AND
"A1"."Y"=:2

Plan hash value: 3665358051

--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |       |       |     1 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |     1 |    26 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_X  |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Test 4: hinting T_Y using the column name

SQL> select * from dual
  2  where exists (
  3    select /*+ index (t,(y)) */ null
  4    from   t@dblink t
  5    where  x = :x
  6    and    y = :y);

no rows selected
SQL> select * from table(dbms_xplan.display_cursor);

-------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     2 (100)|          |        |      |
|*  1 |  FILTER            |      |       |       |            |          |        |      |
|   2 |   TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |        |      |
|   3 |   REMOTE           |      |       |       |            |          | DBLINK | R->S |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)

Remote SQL Information (identified by operation id):
----------------------------------------------------
   3 - SELECT /*+ INDEX ("A1" ???) */ 0 FROM "T" "A1" WHERE "A1"."X"=:1 AND
       "A1"."Y"=:2 (accessing 'DBLINK' )
SQL> select sql_id,child_number,to_char(last_active_time,'hh24:mi:ss') from v$sql
  2  where sql_text like 'SELECT /*+ INDEX ("A1" ???) */ 0%';

SQL_ID         CN TO_CHAR(LAST_ACTIVE_TIME
------------- --- ------------------------
5czydhvwxgctm   0 10:45:43

1 row selected.
SQL> select * from table(dbms_xplan.display_cursor('5czydhvwxgctm',0));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  5czydhvwxgctm, child number 0
-------------------------------------
SELECT /*+ INDEX ("A1" ???) */ 0 FROM "T" "A1" WHERE "A1"."X"=:1 AND
"A1"."Y"=:2

Plan hash value: 3665358051

--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |       |       |     1 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |     1 |    26 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_X  |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

4 thoughts on “Index Hints and Distributed Queries”

  1. Hi Oren,

    Very interesting finding / bug 🙂

    Is this behavior still the same if you add the table prefix to the column in the hint, like in INDEX (t, (t.x)) ?

    And, just for the sake of art, since the wrong replacement is done
    by the local database, I wonder what happens if you add a
    DRIVING_SITE hint to the main query, regardless, of course, of any optimization-related logic, like this:

    select /*+ DRIVING_SITE (@sub t) */ 
           * from dual
    where exists (
        select /*+ qb_name(sub) index (t,(x)) */ null
        from   t@dblink t
        where  x = :x
        and    y = :y);
    

    Cheers & Best Regards,
    Iudith

    1. Thanks Iudith.
      Same behavior when specifying (t.x) instead of (x).
      Same behavior when adding the DRIVING_SITE hint (which makes sense, as the question marks appear in the expansion stage of the parser, before the optimizer even starts to think about execution plans).
      Thanks,
      Oren.

  2. Hi Oren,

    Sorry for being a “nudnik”, but I think that there is a mistake in the
    syntax of the index hint:

    It should be INDEX (T (x)) and not INDEX (T,(x)) ,
    that is, *without* the comma after the table name/alias.

    If there are several leading columns specified,
    they also should appear without a separating comma,
    like this: INDEX (T (x y)).

    In fact, if specifying an INDEX hint in the “classic” style,
    using the index name, then a comma is also NOT needed, as by the
    documented syntax: INDEX (T t_x).

    However, here it looks like the parser did however accept the
    comma in INDEX(T, t_x), but removed it in the parsed version:
    making it INDEX (“A1” “T_X”)/

    I would be glad if this could correct the weird behavior 🙂

    Cheers & Best Regards,
    Iudith

    1. The syntax is flexible in this case. Commas are allowed, but they are not mandatory.
      As written in the documentation:

      When tablespec is followed by indexspec in the specification of a hint, a comma separating the table name and index name is permitted but not required. Commas are also permitted, but not required, to separate multiple occurrences of indexspec.

      Anyway, I had tested both cases (with and without commas) before I wrote this post. Same behavior…
      Thanks,
      Oren.

Leave a Reply

Your email address will not be published. Required fields are marked *