Overview
The FILTER clause is a nice addition to the SQL syntax in Oracle 26ai (23.26.1). It allows to restrict the rows considered by aggregate functions.
An aggregate query divides a set of records into groups, and returns a single result row for each group.
By default, the input for every aggregate function in the query includes all the records of the group.
However, sometimes we want the input to a specific aggregate function in the query to be only a subset of the group’s records. The FILTER clause provides an elegant solution for this requirement.
Example
As an example, I’ll use the EUROVISION_WINNERS table that I created for my Pattern Matching presentation.
This table contains a record for every year of the Eurovision Song Contest, with some details about the winner of that year.
SQL> select * from eurovision_winners order by year;
I want to analyze the languages of the winning songs. For each language, I’d like to get:
- The total number of wins
- The average points for wins in the 20th century
- The average points for wins in the 21st century
- The last year in which the winning country also hosted the contest
The first aggregate – the total number of wins – is simple, because it considers all the group’s records.
But for the other aggregates we need to consider only a subset of the group’s records.
Pre-26ai Solution
In versions before 26ai, a common practice to achieve this leveraged the fact that aggregate functions ignore NULLs.
So, instead of using AVG(POINTS) or MAX(YEAR), we apply the functions on expressions that return POINTS or YEAR for the records we want to consider, and NULL for all the records we want to filter out.
SQL> select language, 2 count(*) total_wins, 3 round(avg(case when year<=2000 then points end)) average_points_20th_century, 4 round(avg(case when year>2000 then points end)) average_points_21st_century, 5 max(case when winner=host then year end) last_win_at_home 6 from eurovision_winners 7 group by language 8 order by language;
The FILTER Clause
In 26ai, the FILTER clause provides a dedicated solution for this requirement.
We can add the required filter to each individual aggregate function as a Boolean condition:
SQL> select language, 2 count(*) total_wins, 3 round(avg(points) filter (where year<=2000)) average_points_20th_century, 4 round(avg(points) filter (where year>2000)) average_points_21st_century, 5 max(year) filter (where winner=host) last_win_at_home 6 from eurovision_winners 7 group by language 8 order by language; TOTAL AVG_POINTS AVG_POINTS LAST_WIN LANGUAGE WINS 20TH_CENTURY 21ST_CENTURY AT_HOME ------------- ----- ------------ ------------ ---------- Crimean Tatar 1 534 Croatian 1 137 Danish 1 42 Dutch 3 23 English 33 146 341 1994 French 13 94 1973 German 2 96 Hebrew 3 151 1979 Italian 3 99 524 Norwegian 2 136 Portuguese 1 758 Serbian 1 268 Spanish 1 29 Swedish 2 146 Ukrainian 2 456 15 rows selected.
I like it! This syntax conveys the goal that we want to achieve in a more clear and straightforward way.
Under the Hood
Let’s see how the new syntax is implemented. By using DBMS_UTILITY.EXPAND_SQL_TEXT, we can see that the new syntax is simply converted into the pre-26ai solution:
SQL> var x clob SQL> begin 2 dbms_utility.expand_sql_text( 3 input_sql_text => ' 4 select language, 5 count(*) total_wins, 6 round(avg(points) filter (where year<=2000)) average_points_20th_century, 7 round(avg(points) filter (where year>2000)) average_points_21st_century, 8 max(year) filter (where winner=host) last_win_at_home 9 from eurovision_winners 10 group by language 11 order by language', 12 output_sql_text => :x); 13 end; 14 / PL/SQL procedure successfully completed. SQL> print x -- I formatted the output to make it more readable X -------------------------------------------------------------------------------- SELECT "A1"."LANGUAGE" "LANGUAGE", COUNT(*) "TOTAL_WINS", ROUND(AVG(CASE WHEN "A1"."YEAR"<=2000 THEN "A1"."POINTS" ELSE NULL END )) "AVERAGE_POINTS_20TH_CENTURY", ROUND(AVG(CASE WHEN "A1"."YEAR">2000 THEN "A1"."POINTS" ELSE NULL END )) "AVERAGE_POINTS_21ST_CENTURY", MAX(CASE WHEN "A1"."WINNER"="A1"."HOST" THEN "A1"."YEAR" ELSE NULL END ) "LAST_WIN_AT_HOME" FROM "DEMO2"."EUROVISION_WINNERS" "A1" GROUP BY "A1"."LANGUAGE" ORDER BY "A1"."LANGUAGE"
This is an approach that Oracle uses quite often – supporting new syntax by converting the SQL statement into an equivalent statement that was already supported.
I think this is a clever approach. It requires relatively little effort and carries low risk.
If you’d like to see more examples of this approach, I wrote a post about it several years ago: EXPAND_SQL_TEXT – Much More Than Just Expanding Views.
The HAVING Clause
Since aggregate functions can also be used in the HAVING clause, we can obviously use the FILTER clause there as well. Let’s write a query that returns for each language:
- The total number of wins
- The number of unique winners
- The list of years in which the winning country also hosted the contest
But this time we don’t want to get all the languages, but only the ones that had at least 2 wins in the 20th century. We’ll use the HAVING clause for that.
SQL> select language,
2 count(*) total_wins,
3 count(distinct winner) unique_winners,
4 listagg(year,',') within group (order by year) filter (where winner=host) wins_at_home
5 from eurovision_winners
6 group by language
7 having count(*) filter (where year<=2000) >= 2
8 order by language;
TOTAL UNIQUE WINS
LANGUAGE WINS WINNERS AT_HOME
------------- ----- ------- --------------------
Dutch 3 1
English 33 17 1993,1994
French 13 5 1956,1973
German 2 2
Hebrew 3 1 1979
Italian 3 1
Norwegian 2 1
Swedish 2 1
8 rows selected.
Subqueries
Note that one of the documented restrictions is that we cannot use subqueries in the FILTER clause. So, for example, although the following query is supported…
SQL> select language,
2 count(*) total_wins,
3 count(case when not exists(select null from eurovision_winners w2 where w2.winner=w1.winner and w2.year!=w1.year) then 1 end) one_time_winners
4 from eurovision_winners w1
5 group by language
6 order by language;
TOTAL ONE_TIME
LANGUAGE WINS WINNERS
------------- ----- --------
Crimean Tatar 1 0
Croatian 1 1
Danish 1 0
Dutch 3 0
English 33 7
French 13 2
German 2 0
Hebrew 3 0
Italian 3 0
Norwegian 2 0
Portuguese 1 1
Serbian 1 1
Spanish 1 1
Swedish 2 0
Ukrainian 2 0
15 rows selected.
…the equivalent query with FILTER is not supported:
SQL> select language, 2 count(*) total_wins, 3 count(*) filter (where not exists(select null from eurovision_winners w2 where w2.winner=w1.winner and w2.year!=w1.year)) one_time_winners 4 from eurovision_winners w1 5 group by language 6 order by language; count(*) filter (where not exists(select null from eurovision_winners w2 where w2.winner=w1.winner and w2.year!=w1.year)) one_time_winners * ERROR at line 3: ORA-02251: subquery not allowed here
Analytic Functions
The FILTER clause can also be used with aggregate functions that operate as analytic functions.
For example, let’s get for every year, in addition to the winning country and song, also the number of times the same winner won in the 20th century and in the 21st century:
SQL> select w.year, 2 w.winner, 3 w.song, 4 count(*) filter (where year<=2000) over (partition by winner) winner_wins_20th_century, 5 count(*) filter (where year>2000) over (partition by winner) winner_wins_21st_century 6 from eurovision_winners w 7 order by year; WINNER_WINS WINNER_WINS YEAR WINNER SONG 20TH_CENTURY 21ST_CENTURY ---------- --------------- ----------------------------------- ------------ ------------ 1956 Switzerland Refrain 2 1 1957 Netherlands Net als toen 4 1 1958 France Dors, mon amour 4 0 1959 Netherlands Een beetje 4 1 1960 France Tom Pillibi 4 0 1961 Luxembourg Nous les amoureux 5 0 1962 France Un premier amour 4 0 1963 Denmark Dansevise 2 1 1964 Italy Non ho l'età 2 1 1965 Luxembourg Poupée de cire, poupée de son 5 0 1966 Austria Merci, Chérie 1 2 1967 United Kingdom Puppet on a String 4 0 1968 Spain La, la, la 1 0 1969 Netherlands De troubadour 4 1 1970 Ireland All Kinds of Everything 7 0 1971 Monaco Un banc, un arbre, une rue 1 0 1972 Luxembourg Après toi 5 0 1973 Luxembourg Tu te reconnaîtras 5 0 1974 Sweden Waterloo 4 3 1975 Netherlands Ding-a-dong 4 1 1976 United Kingdom Save Your Kisses for Me 4 0 1977 France L'oiseau et l'enfant 4 0 1978 Israel A-Ba-Ni-Bi 3 1 1979 Israel Hallelujah 3 1 1980 Ireland What's Another Year 7 0 1981 United Kingdom Making Your Mind Up 4 0 1982 Germany Ein bißchen Frieden 1 1 1983 Luxembourg Si la vie est cadeau 5 0 1984 Sweden Diggi-Loo Diggi-Ley 4 3 1985 Norway La det swinge 2 1 1986 Belgium J'aime la vie 1 0 1987 Ireland Hold Me Now 7 0 1988 Switzerland Ne partez pas sans moi 2 1 1989 Yugoslavia Rock Me 1 0 1990 Italy Insieme: 1992 2 1 1991 Sweden Fångad av en stormvind 4 3 1992 Ireland Why Me? 7 0 1993 Ireland In Your Eyes 7 0 1994 Ireland Rock 'n' Roll Kids 7 0 1995 Norway Nocturne 2 1 1996 Ireland The Voice 7 0 1997 United Kingdom Love Shine a Light 4 0 1998 Israel Diva 3 1 1999 Sweden Take Me to Your Heaven 4 3 2000 Denmark Fly on the Wings of Love 2 1 2001 Estonia Everybody 0 1 2002 Latvia I Wanna 0 1 2003 Turkey Everyway That I Can 0 1 2004 Ukraine Wild Dances 0 3 2005 Greece My Number One 0 1 2006 Finland Hard Rock Hallelujah 0 1 2007 Serbia Molitva 0 1 2008 Russia Believe 0 1 2009 Norway Fairytale 2 1 2010 Germany Satellite 1 1 2011 Azerbaijan Running Scared 0 1 2012 Sweden Euphoria 4 3 2013 Denmark Only Teardrops 2 1 2014 Austria Rise Like a Phoenix 1 2 2015 Sweden Heroes 4 3 2016 Ukraine 1944 0 3 2017 Portugal Amar pelos dois 0 1 2018 Israel Toy 3 1 2019 Netherlands Arcade 4 1 2021 Italy Zitti e buoni 2 1 2022 Ukraine Stefania 0 3 2023 Sweden Tattoo 4 3 2024 Switzerland The Code 2 1 2025 Austria Wasted Love 1 2 69 rows selected.

One thought on “Aggregation Filters”