The Constraint Optimization series:
- Part 1: Optimization of Check Constraint Creation
- Part 2: Optimization that Violates Data Integrity
- Part 3: Optimization of Foreign Key Constraint Creation
- Part 4: (Lack of) Optimization of Unique Constraint Creation
- Part 5: Adding a Column with a Default Value and a Constraint
- Part 6: Fast but Offline, or Online but Slow?
- Part 7: Adding a Unique Constraint in an Online Way
- Part 8: Constraint Optimization Summary
In the previous post I showed that Oracle does a nice optimization when we add a new (nullable with no default value) column with an inline check constraint in a single ALTER TABLE statement.
However, there is one case where this optimization allows for data integrity violation instead of forbidding it (which makes it a bug, in this specific case, rather than an optimization). It happens when the check constraint is “column IS NOT NULL”.
I’ll set up a table with a few records to demonstrate this:
SQL> create table t (x number);

Table created.

SQL> insert into t select rownum from dual connect by level<=10;

10 rows created.

SQL> commit;

Commit complete.
First, let's try to add a column with an out-of-line constraint, so the optimization will not apply:
SQL>
SQL> ALTER TABLE T ADD (
2 C NUMBER,
3 CONSTRAINT C_CHK CHECK (C IS NOT NULL)
4 );
CONSTRAINT C_CHK CHECK (C IS NOT NULL)
*
ERROR at line 3:
ORA-02293: cannot validate (TRANZMATE_PRODUCTION.C_CHK) - check constraint violated
It failed, and that's good.
Now, let's try to add the same column with the same check constraint, but this time we'll use an inline constraint:
SQL> ALTER TABLE T ADD (
2 C NUMBER CONSTRAINT C_CHK CHECK (C IS NOT NULL)
3 );

Table altered.
And it succeeded, although it should have failed.
So now we have a validated constraint, stating that C IS NOT NULL, and at the same time the table contains records, and in all of them C IS NULL:
SQL> select table_name,search_condition,status,validated,rely
2 from user_constraints where constraint_name='C_CHK';

TABLE_NAME SEARCH_CONDITION     STATUS     VALIDATED  RELY
---------- -------------------- ---------- ---------- ------------
T          C IS NOT NULL        ENABLED    VALIDATED

SQL> select * from t;

         X          C
---------- ----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.
Since we're adding a nullable column with no default value to a table that already contains rows, a check constraint stating that this column is not nullable doesn't make much sense, and if someone tries to do it, it's probably their own bug. Still, it would be nice to get an error in this case rather than having the data integrity violated.
Why does it happen?
When Oracle validates a check constraint, it checks whether there are records for which the constraint condition is FALSE. When the column is NULL, the condition usually evaluates to NULL rather than FALSE, so the constraint is considered valid. For example, when C is NULL, "C>0" evaluates to NULL (neither FALSE nor TRUE).
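To illustrate this three-valued logic, here is a small example of my own (not from the original post), evaluating a condition like "C>0" against a NULL value:

-- the condition is neither TRUE nor FALSE when C is NULL,
-- so validation does not treat such a row as a violation
select case
         when c > 0       then 'TRUE'
         when not (c > 0) then 'FALSE'
         else                  'NEITHER (NULL)'
       end as check_result
from   (select cast(null as number) c from dual);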
I guess this is why Oracle skips the validation step completely as part of this optimization.
What should happen?
It's good that Oracle doesn't validate each and every record, as it's really unnecessary, but to make this optimization foolproof it should validate a single value – NULL.
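Until that happens, a simple workaround (my own sketch, with hypothetical names C2 and C2_CHK) is to split the operation into two statements, so the regular validation path runs and fails just like the out-of-line variant above:

-- adding the column alone succeeds
alter table t add (c2 number);

-- adding the constraint separately triggers validation and fails with ORA-02293,
-- because for the existing rows "C2 IS NOT NULL" evaluates to FALSE
alter table t add constraint c2_chk check (c2 is not null);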
Hi Oren,
It looks like the issue has already been reported:
Bug 1290306 : ADD COLUMN AND CONSTRAINT IN ONE STATEMENT – LEAVES VIOLATED CONSTRAINT
Though it is unlikely to be resolved any time soon, as it has not been fixed for the last 17 years.
For this particular case, I would go on with the NOT NULL constraint.
I understand that you probably meant a broader case with the inline CHECK constraint violating existing table data.
SQL> alter table t add c number constraint c_nn not null;
alter table t add c number constraint c_nn not null
*
ERROR at line 1:
ORA-01758: table must be empty to add mandatory (NOT NULL) column
I have worked with a lot of people who confuse the NOT NULL constraint with CHECK (col is not null) – they are not the same.
Not only do they have different CDEF$.TYPE# values, but also NOT NULL constraints are usually slightly faster 🙂
Or sometimes more than slightly (note that the table has to be empty for the following ALTER to succeed, unlike in the demo above):
SQL> alter table t add c number constraint c_nn not null;
Table altered.
SQL> explain plan for select * from t where c is null;
Explained.
-- FTS (full table scan) was completely eliminated
SQL> @?/rdbms/admin/utlxpls
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1401356643
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 0 (0)| |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T | 1 | 2 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NULL IS NOT NULL)
SQL> alter table t add ck number constraint c_ck check(ck is not null);
Table altered.
SQL> explain plan for select * from t where ck is null;
Explained.
-- Here we scan data
SQL> @?/rdbms/admin/utlxpls
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 2153619298
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 2 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CK" IS NULL)
To top it off, the optimizer is aware of the difference between NOT NULL and CHECK constraints, and it does not generate a transitive predicate in the second case above, such as the NULL IS NOT NULL filter it generated in the first case.
Otherwise, we would have ended up with wrong results in the second case.
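To make the wrong-results point concrete, here is a sketch of my own (hypothetical table T2 and constraint T2_CK, assuming a version where the bug from the post reproduces):

-- recreate the buggy situation: a validated check constraint, yet all rows have NULL
create table t2 (x number);
insert into t2 select rownum from dual connect by level <= 10;
commit;
alter table t2 add (ck number constraint t2_ck check (ck is not null));

-- all 10 rows have CK = NULL, so this must return 10;
-- if the optimizer trusted the check constraint and added a
-- filter(NULL IS NOT NULL) predicate, it would wrongly return 0
select count(*) from t2 where ck is null;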