In a recent post I suggested a way to write a single SQL query that filters a table by one or more of several columns.
Here is the query from that post:
select * from employees where rowid in ( select rid from ( select rowid rid from employees where department_id = :department_id union all select rowid rid from employees where job_id = :job_id union all select rowid rid from employees where manager_id = :manager_id union all select rowid rid from employees where last_name = :last_name) group by rid having count(*) = nvl2(:department_id,1,0) + nvl2(:job_id,1,0) + nvl2(:manager_id,1,0) + nvl2(:last_name,1,0) );
This is actually a simplification of a real task I handled some time ago.
In today’s post I’d like to look at a variation of the problem, which is closer to the real issue I handled.
In the previous example we had 4 columns that each one of them may, or may not, be filtered by. Now, let’s think of another case, where, in addition to the 4 optional filters, there is one mandatory filter.
For example, let’s extend the HR data model, so it will include companies, and each company has its own departments, jobs and employees. There should be no dependency between records from different companies.
So I’ll add a new table – COMPANIES, and a mandatory column – COMPANY_ID – to each of the other tables (DEPARTMENTS, JOBS, and EMPLOYEES). COMPANY_ID will become part of the primary key in every table, and all the foreign key constraints from EMPLOYEES to the other tables will include the COMPANY_ID column as well.
Here are the DDL statements for this model:
create table companies ( company_id number(4) not null, company_name varchar2(30) not null, -- constraint companies_pk primary key (company_id) ); create table departments ( company_id number(4) not null, department_id number(4) not null, department_name varchar2(30) not null, manager_id number(6), location_id number(4), -- constraint departments_pk primary key (company_id,department_id) ); create table jobs( company_id number(4) not null, job_id varchar2(10) not null, job_title varchar2(35) not null, min_salary number(6), max_salary number(6), -- constraint jobs_pk primary key(company_id,job_id) ); create table employees ( company_id number(4) not null, employee_id number(6) not null, first_name varchar2(20), last_name varchar2(25) not null, email varchar2(25) not null, phone_number varchar2(20), hire_date date not null, job_id varchar2(10) not null, salary number(8,2), commission_pct number(2,2), manager_id number(6), department_id number(4), -- constraint emp_pk primary key (company_id,employee_id), constraint emp_salary_min check (salary > 0), constraint emp_email_uk unique (company_id,email), constraint emp_dept_fk foreign key (company_id,department_id) references departments (company_id,department_id), constraint emp_job_fk foreign key (company_id,job_id) references jobs (company_id,job_id), constraint emp_manager_fk foreign key (company_id,manager_id) references employees (company_id,employee_id) ); create index emp_department_ix on employees (company_id,department_id); create index emp_job_ix on employees (company_id,job_id); create index emp_manager_ix on employees (company_id,manager_id); create index emp_name_ix on employees (company_id, last_name, first_name);
And now we want to modify the SQL query, so it will filter the employees by COMPANY_ID (always), and by zero or more of the following columns: DEPARTMENT_ID, JOB_ID, MANAGER_ID and LAST_NAME. For that I’ll make 3 changes in the original query:
- One change is adding the condition on COMPANY_ID. Note how we can add it once, and let the optimizer push the predicate into each one of the UNION ALL branches.
- A second change is adding another branch to the UNION ALL, to handle the case of zero optional filters (the original query handled one or more optional filters).
- And finally we need to change the HAVING clause. The original expression in the HAVING clause is evaluated to the number of non-null variable values – between 1 and 4. Now, if all the optional variables are nulls it is evaluated to 0, but then (and only then) the new branch of the UNION ALL is executed, so we’d like to count exactly 1 result per rowid. Hence, we take the greatest number between 1 and the original expression.
Here is the revised query:
select * from employees where rowid in ( select rid from ( select rowid rid,company_id from employees where department_id = :department_id union all select rowid rid,company_id from employees where job_id = :job_id union all select rowid rid,company_id from employees where manager_id = :manager_id union all select rowid rid,company_id from employees where last_name = :last_name union all select rowid rid,company_id from employees where :department_id is null and :job_id is null and :manager_id is null and :last_name is null) where company_id = :company_id group by rid having count(*) = greatest (1, nvl2(:department_id,1,0) + nvl2(:job_id,1,0) + nvl2(:manager_id,1,0) + nvl2(:last_name,1,0) ) );
And here is the execution plan:
--------------------------------------------------------- | Id | Operation | Name | --------------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | NESTED LOOPS | | | 2 | VIEW | VW_NSO_1 | |* 3 | FILTER | | | 4 | HASH GROUP BY | | | 5 | VIEW | | | 6 | UNION-ALL | | |* 7 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | |* 8 | INDEX RANGE SCAN | EMP_JOB_IX | |* 9 | INDEX RANGE SCAN | EMP_MANAGER_IX | |* 10 | INDEX RANGE SCAN | EMP_NAME_IX | |* 11 | FILTER | | |* 12 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | | 13 | TABLE ACCESS BY USER ROWID| EMPLOYEES | --------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter(COUNT(*)=GREATEST(1,NVL2(:DEPARTMENT_ID,1,0)+NVL2(:JOB_ID,1,0)+NVL2(:MANAGE R_ID,1,0)+NVL2(:LAST_NAME,1,0))) 7 - access("COMPANY_ID"=TO_NUMBER(:COMPANY_ID) AND "DEPARTMENT_ID"=TO_NUMBER(:DEPARTMENT_ID)) 8 - access("COMPANY_ID"=TO_NUMBER(:COMPANY_ID) AND "JOB_ID"=:JOB_ID) 9 - access("COMPANY_ID"=TO_NUMBER(:COMPANY_ID) AND "MANAGER_ID"=TO_NUMBER(:MANAGER_ID)) 10 - access("COMPANY_ID"=TO_NUMBER(:COMPANY_ID) AND "LAST_NAME"=:LAST_NAME) 11 - filter(:LAST_NAME IS NULL AND :MANAGER_ID IS NULL AND :JOB_ID IS NULL AND :DEPARTMENT_ID IS NULL) 12 - access("COMPANY_ID"=TO_NUMBER(:COMPANY_ID))