I’ve been told that using NOT EXISTS
in (Oracle) SQL is a bad idea, and that a way to overcome this problem is to collect the non-matching rows with an OUTER JOIN
. So I decided to check if it is true.
In order to start, here is my test case:
create table t1(id number, constraint t1_pk primary key(id)); create table t2(id number); begin for i in 1..100 loop insert into t1 values(i); end loop; commit; end; begin for i in 1..100000 loop insert into t2 values(mod(i,97)); end loop; commit; end; / create index t2_idx on t2(id); exec dbms_stats.gather_table_stats(USER,'T1'); exec dbms_stats.gather_table_stats(USER,'T2');
First, I checked what I’d been told, i.e. that the OUTER JOIN
is more efficient than the NOT EXISTS
. In order to do that, I wrote a simple SELECT
and displayed the plan for both syntaxes (my database is 11.1.0.6 on Linux 32-bits). As I assumed, it’s not the case. In fact, both orders took the same plan.
Here is the plan with NOT EXISTS
:
explain plan for select id from t1 a where not exists (select 1 from t2 b where b.id=a.id); select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ----------------------------------------------------------------- Plan hash value: 1906534000 ----------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| ----------------------------------------------------------------- | 0 | SELECT STATEMENT | | 96 | 480 | 46 (5)| |* 1 | HASH JOIN ANTI | | 96 | 480 | 46 (5)| | 2 | INDEX FULL SCAN | T1_PK | 100 | 300 | 1 (0)| | 3 | TABLE ACCESS FULL| T2 | 100K| 195K| 44 (3)| ----------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("B"."ID"="A"."ID")
Here is the plan with the OUTER JOIN
:
explain plan for select a.id from t1 a, t2 b where a.id=b.id(+) and b.id is null; select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ----------------------------------------------------------------- Plan hash value: 1906534000 ----------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| ----------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 15 | 46 (5)| |* 1 | HASH JOIN ANTI | | 3 | 15 | 46 (5)| | 2 | INDEX FULL SCAN | T1_PK | 100 | 300 | 1 (0)| | 3 | TABLE ACCESS FULL| T2 | 100K| 195K| 44 (3)| ----------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("A"."ID"="B"."ID")
I know — the real way to check that both queries are equivalent is to trace the plan generation with a 10053 event. (I cannot explain this, so I’ll leave that to you.) However, the original query I’ve been told to rewrite was not a SELECT
, but the DELETE
below:
delete from t1 a where not exists (select 1 from t2 b where b.id=a.id);
I haven’t yet found how to rewrite it in a way that makes it more efficient with an OUTER JOIN
. I’ll be happy if someone can help me, at least to find the syntax, if not to enhance response time.
4 Comments. Leave new
delete from t1 a
where exists
(select 1 from t1 one left outer join t2 two on two.id = one.id where two.id is null);
just stumbled upon this, so maybe useless by now…
Anyway, the delete should be similar to select of the missing records:
select a.id from t1 a left join t2 b on a.id=b.id where b.id is null;
just apply the delete syntax of your DBMS:
delete from t1 a left join t2 b on a.id=b.id where b.id is null;
or in MSAccess:
delete a.* from t1 as a left join t2 as b on a.id=b.id where b.id is null;
hope it helps.
You can do this and get the same result…
delete
from t1 a
where 0 = (select count(b.id)
from t2 b
where b.id = a.id);
[…] Oracle: Is OUTER JOIN Better Than NOT EXISTS? […]