Overview
There are features in Oracle SQL that are implemented by other, older, features. This is a clever way for supporting a new syntax with low efforts and low risk – the Oracle Corp engineers only need to convert the SQL statement with the new syntax to an equivalent statement that uses the old syntax they already support. And Oracle has a perfect place for doing this conversion – the expansion stage in the parsing process.
SQL Expansion
When a SQL statement is processed, it goes through several stages, in this order: parsing, optimization, row source generation, and execution.
Note: Parsing is a confusing term, as many times when we say “parsing” (especially “hard parsing”) we actually mean “parsing + optimization + row source generation”.
The first stage, the parsing, is not too complex (comparing to the optimization stage). The Parser basically checks the syntax and the semantics of the statement. If needed, it also expands the statement. For example, it replaces each view referenced in the statement with its definition, so after parsing the statement refers only to actual tables. Another example: it expands * to the actual column list.
EXPAND_SQL_TEXT
As of Oracle 12c we have a simple and documented way to see the output of the expansion that is done by the Parser – using the DBMS_UTILITY.EXPAND_SQL_TEXT procedure. In 11g this procedure was included in another package – DBMS_SQL2 – but was undocumented.
Note: to be more precise, I assume that this procedure reveals everything that the Parser does during the expansion stage, and only that. The documentation of DBMS_UTILITY.EXPAND_SQL_TEXT is very limited. It only says “This procedure recursively replaces any view references in the input SQL query with the corresponding view subquery”, and the Usage Notes imply that it also shows the outcome of applying VPD policies.
Examples
Let’s see some examples for such query expansions:
Views
View expansion is the classic use case that is used for demonstrating EXPAND_SQL_TEXT. You can see examples by Tim Hall and Maria Colgan in the following links:
https://oracle-base.com/articles/12c/expand-sql-text-12cr1
https://sqlmaria.com/2018/03/13/how-to-determine-which-view-to-use/
Pivot
The Pivot operator was added in Oracle 11g for writing cross-tabulation queries that rotate rows into columns. During the rotation process, an implicit aggregation is performed.
Let’s find, for example, the number of employees and their average salary per department, in departments 10, 20 and 30:
SQL> select * 2 from (select job_id, department_id, salary 3 from hr.employees e 4 where department_id <= 30) 5 pivot ( 6 count(*) as employees, 7 avg(salary) as avg_sal 8 for department_id in (10, 20, 30) 9 ); JOB_ID 10_EMPLOYEES 10_AVG_SAL 20_EMPLOYEES 20_AVG_SAL 30_EMPLOYEES 30_AVG_SAL ---------- ------------ ---------- ------------ ---------- ------------ ---------- PU_CLERK 0 0 5 2780 PU_MAN 0 0 1 11000 AD_ASST 1 4400 0 0 MK_MAN 0 1 13000 0 MK_REP 0 1 6000 0
We can use EXPAND_SQL_TEXT to see how the implicit aggregation happens exactly:
SQL> var x clob SQL> begin 2 dbms_utility.expand_sql_text( 3 input_sql_text => ' 4 select * 5 from (select job_id, department_id, salary 6 from hr.employees e 7 where department_id <= 30) 8 pivot ( 9 count(*) as employees, 10 avg(salary) as avg_sal 11 for department_id in (10, 20, 30) 12 )', 13 output_sql_text => :x); 14 end; 15 / PL/SQL procedure successfully completed. print x -- I formatted the output to make it more readable SELECT "A1"."JOB_ID" "JOB_ID", "A1"."10_EMPLOYEES" "10_EMPLOYEES", "A1"."10_AVG_SAL" "10_AVG_SAL", "A1"."20_EMPLOYEES" "20_EMPLOYEES", "A1"."20_AVG_SAL" "20_AVG_SAL", "A1"."30_EMPLOYEES" "30_EMPLOYEES", "A1"."30_AVG_SAL" "30_AVG_SAL" FROM (SELECT "A2"."JOB_ID" "JOB_ID", NVL(SUM(CASE WHEN ("A2"."DEPARTMENT_ID" = 10) THEN 1 END), 0) "10_EMPLOYEES", AVG(CASE WHEN ("A2"."DEPARTMENT_ID" = 10) THEN "A2"."SALARY" END) "10_AVG_SAL", NVL(SUM(CASE WHEN ("A2"."DEPARTMENT_ID" = 20) THEN 1 END), 0) "20_EMPLOYEES", AVG(CASE WHEN ("A2"."DEPARTMENT_ID" = 20) THEN "A2"."SALARY" END) "20_AVG_SAL", NVL(SUM(CASE WHEN ("A2"."DEPARTMENT_ID" = 30) THEN 1 END), 0) "30_EMPLOYEES", AVG(CASE WHEN ("A2"."DEPARTMENT_ID" = 30) THEN "A2"."SALARY" END) "30_AVG_SAL" FROM (SELECT "A3"."JOB_ID" "JOB_ID", "A3"."DEPARTMENT_ID" "DEPARTMENT_ID", "A3"."SALARY" "SALARY" FROM "HR"."EMPLOYEES" "A3" WHERE "A3"."DEPARTMENT_ID" <= 30) "A2" GROUP BY "A2"."JOB_ID") "A1"
Row Limiting
The Row Limiting clause was added in Oracle 12c for simplifying Top-N queries and pagination in SQL. I dedicated a post for this feature as part of the Write Less With More series, and you can see there that Oracle actually translates the new Row Limiting clause into conditions with analytic functions – ROW_NUMBER for the ONLY option and RANK for the WITH TIES option:
https://db-oriented.com/2015/10/23/write-less-with-more-part-5/
Joins
Most of the joins that are written using the ANSI syntax are converted to the proprietary Oracle syntax.
Let’s see for example a query that uses the OUTER APPLY syntax (supported in Oracle since 12c).
SQL> create table t ( 2 id integer, 3 occurrences integer, 4 instantiate char(1) 5 ); Table created. SQL> SQL> insert into t values (123,3,'Y'); 1 row created. SQL> insert into t values (345,1,'Y'); 1 row created. SQL> insert into t values (567,5,'N'); 1 row created. SQL> insert into t values (789,2,'Y'); 1 row created.
The following query returns all the records from t. For records with instantiate=Y it instantiates several records, as specified by the occurrences column.
SQL> select t.*,
2 r.instance#
3 from t
4 outer apply (select rownum instance#
5 from dual
6 where t.instantiate = 'Y'
7 connect by level <= t.occurrences) r;
ID OCCURRENCES INS INSTANCE#
---------- ----------- --- ----------
123 3 Y 1
123 3 Y 2
123 3 Y 3
345 1 Y 1
567 5 N
789 2 Y 1
789 2 Y 2
7 rows selected.
We can see, using EXPAND_SQL_TEXT, that OUTER APPLY is implemented using lateral inline views and (+), the good old outer join syntax:
SQL> var x clob SQL> begin 2 dbms_utility.expand_sql_text( 3 input_sql_text => q'' 4 select t.*, 5 r.instance# 6 from t 7 outer apply (select rownum instance# 8 from dual 9 where t.instantiate = 'Y' 10 connect by level <= t.occurrences) r'', 11 output_sql_text => :x); 12 end; 13 / PL/SQL procedure successfully completed. print x -- I formatted the output to make it more readable SELECT "A1"."ID_0" "ID", "A1"."OCCURRENCES_1" "OCCURRENCES", "A1"."INSTANTIATE_2" "INSTANTIATE", "A1"."INSTANCE#_3" "INSTANCE#" FROM (SELECT "A3"."ID" "ID_0", "A3"."OCCURRENCES" "OCCURRENCES_1", "A3"."INSTANTIATE" "INSTANTIATE_2", "A2"."INSTANCE#_0" "INSTANCE#_3" FROM "DEMO"."T" "A3", LATERAL((SELECT "A4"."INSTANCE#_0" "INSTANCE#_0" FROM LATERAL((SELECT ROWNUM "INSTANCE#_0" FROM "SYS"."DUAL" "A5" WHERE "A3"."INSTANTIATE" = 'Y' CONNECT BY LEVEL <= "A3"."OCCURRENCES")) "A4" WHERE 1 = 1))(+) "A2") "A1"
Temporal Validity
Temporal Validity was added in Oracle 12c. It allows to apply filtering based on validity period (or range), either explicitly or implicitly.
Explicit filtering is done at the statement-level. Implicit filtering is done by a session-level control.
We can see that both statement-level control and session-level control are implemented at the expansion stage.
SQL> create table project_assignments (
2 assignment_id integer not null constraint project_assignments_pk primary key,
3 person_id integer not null,
4 project_id integer not null
5 );
Table created.
SQL> alter table project_assignments
2 add period for assignment_period;
Table altered.
Let's see first how the explicit filtering (using the as of period clause in the SELECT statement) is implemented:
SQL> var x clob SQL> begin 2 dbms_utility.expand_sql_text( 3 input_sql_text => ' 4 select person_id, 5 project_id, 6 assignment_period_start, 7 assignment_period_end 8 from project_assignments 9 as of period for assignment_period sysdate', 10 output_sql_text => :x); 11 end; 12 / PL/SQL procedure successfully completed. SQL> print x -- I formatted the output to make it more readable SELECT "A1"."PERSON_ID" "PERSON_ID", "A1"."PROJECT_ID" "PROJECT_ID", "A1"."ASSIGNMENT_PERIOD_START" "ASSIGNMENT_PERIOD_START", "A1"."ASSIGNMENT_PERIOD_END" "ASSIGNMENT_PERIOD_END" FROM (SELECT "A2"."ASSIGNMENT_PERIOD_START" "ASSIGNMENT_PERIOD_START", "A2"."ASSIGNMENT_PERIOD_END" "ASSIGNMENT_PERIOD_END", "A2"."ASSIGNMENT_ID" "ASSIGNMENT_ID", "A2"."PERSON_ID" "PERSON_ID", "A2"."PROJECT_ID" "PROJECT_ID" FROM "DEMO5"."PROJECT_ASSIGNMENTS" "A2" WHERE ("A2"."ASSIGNMENT_PERIOD_START" IS NULL OR "A2"."ASSIGNMENT_PERIOD_START" <= SYSDATE) AND ("A2"."ASSIGNMENT_PERIOD_END" IS NULL OR "A2"."ASSIGNMENT_PERIOD_END" > SYSDATE)) "A1"
And now let's see how the implicit filtering (using the session-level control) is implemented:
SQL> exec dbms_flashback_archive.enable_at_valid_time('CURRENT') PL/SQL procedure successfully completed. SQL> begin 2 dbms_utility.expand_sql_text( 3 input_sql_text => ' 4 select person_id, 5 project_id, 6 assignment_period_start, 7 assignment_period_end 8 from project_assignments', 9 output_sql_text => :x); 10 end; 11 / PL/SQL procedure successfully completed. SQL> print x -- I formatted the output to make it more readable SELECT "A1"."PERSON_ID" "PERSON_ID", "A1"."PROJECT_ID" "PROJECT_ID", "A1"."ASSIGNMENT_PERIOD_START" "ASSIGNMENT_PERIOD_START", "A1"."ASSIGNMENT_PERIOD_END" "ASSIGNMENT_PERIOD_END" FROM (SELECT "A2"."ASSIGNMENT_PERIOD_START" "ASSIGNMENT_PERIOD_START", "A2"."ASSIGNMENT_PERIOD_END" "ASSIGNMENT_PERIOD_END", "A2"."ASSIGNMENT_ID" "ASSIGNMENT_ID", "A2"."PERSON_ID" "PERSON_ID", "A2"."PROJECT_ID" "PROJECT_ID" FROM "DEMO"."PROJECT_ASSIGNMENTS" "A2" WHERE ("A2"."ASSIGNMENT_PERIOD_START" IS NULL OR "A2"."ASSIGNMENT_PERIOD_START" <= systimestamp(6)) AND ("A2"."ASSIGNMENT_PERIOD_END" IS NULL OR "A2"."ASSIGNMENT_PERIOD_END" > systimestamp(6))) "A1"
For more details about Temporal Validity, see Write Less with More – Part 4.