EXISTS Conditions: An EXISTS Condition Tests for the Existence of Rows in a Subquery
exists_condition::=
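As a rough illustration of the condition named in the heading, the sketch below uses EXISTS with a correlated subquery to list the rows of the EMP table (described later in this section) that have at least one exact duplicate. The aliases e and d are illustrative, and the query assumes none of the three columns is NULL.

```sql
-- EXISTS is true as soon as the subquery returns at least one row.
-- Here the subquery looks for a different physical row (different ROWID)
-- that carries the same empno, ename and job values.
SELECT e.empno, e.ename, e.job
FROM   emp e
WHERE  EXISTS (
         SELECT 1
         FROM   emp d
         WHERE  d.empno = e.empno
         AND    d.ename = e.ename
         AND    d.job   = e.job
         AND    d.ROWID <> e.ROWID
       );
```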
Suppose a table has three columns. The following command deletes the duplicate
rows from that table.
group by col1,col2,col3);
Table: EMP
EMPNO NUMBER
ENAME VARCHAR2(20)
JOB VARCHAR2(20)
commit;
end;
/
begin
for i in 1..20 loop
insert into emp values (2,'yy','accountant');
end loop;
commit;
end;
/
begin
Using the previous method as in your TIP for the Week 08/04/2002
------------------------------------------------------------------------
SQL> select count(*) from emp;
COUNT(*)
----------
30040
Elapsed: 00:03:207.48
COUNT(*)
----------
4
Elapsed: 00:00:00.10
-------------------------------------------
SQL> select count(*) from emp;
COUNT(*)
----------
30040
SQL> delete from emp where rowid in (
2 SELECT rowid FROM emp
3 group by rowid,empno,ename,job
4 minus
5 SELECT min(rowid) FROM emp
6 group by empno,ename,job);
30036 rows deleted.
Elapsed: 00:00:02.94
COUNT(*)
----------
4
Elapsed: 00:00:00.10
--------------------------------------------------------------------
As the timings show, both methods leave the same four rows, but the new method is many times faster.
This is because the new method uses the MINUS set operator to compute the list of duplicate rows. The
larger the table, the more noticeable the difference.
MIN() allows you to select one row per group (duplicates and non-duplicates alike), so that you
get a list of all the rows you want to keep:
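A minimal sketch of such a query, assuming the Customers table and the (LastName, FirstName) duplicate key used in this article. Passing ROWID to MIN() is an assumption made here; any column that uniquely identifies a row would serve the same purpose.

```sql
-- One ROWID per (LastName, FirstName) group: these are the rows to keep.
-- Groups with a single row return that row; groups with duplicates
-- return only the row with the lowest ROWID.
SELECT   MIN(ROWID)
FROM     Customers
GROUP BY LastName, FirstName;
```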
Now you just need to delete rows that are not in this list, using the last query as a subquery inside
an antijoin (the NOT IN clause):
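One way this delete might look, again assuming the Customers table, the (LastName, FirstName) key, and ROWID as the row identifier (the last of these is an assumption, not something the text confirms):

```sql
-- Delete every row whose ROWID is not the chosen survivor of its group.
-- MIN(ROWID) is never NULL, so NOT IN behaves as a plain antijoin here.
DELETE FROM Customers
WHERE  ROWID NOT IN (
         SELECT   MIN(ROWID)
         FROM     Customers
         GROUP BY LastName, FirstName
       );
```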
Another disadvantage of this syntax is that you can't control which row in each group of duplicates
is kept in the database.
1. It selects the duplicate data in the cursor, sorted by duplicate key (LastName, FirstName
in our case), as shown in Listing 4.
2. It opens the cursor and fetches each row, one by one, in a loop.
3. It compares the duplicate key value with the previously fetched one.
4. If this is the first fetch, or the value differs from the previous one, the row is the first in a
new group, so the procedure skips it and fetches the next row. Otherwise, the row is a duplicate
within the same group, so the procedure deletes it.
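The four steps above can be sketched roughly as follows. This is not the article's own listing: the procedure and cursor names (DeleteDuplicates, csr_Duplicates) and the Customers key columns come from the text, while the row_id alias, the v_ variables, and deleting by ROWID are assumptions made for illustration. The sketch also assumes LastName and FirstName are never NULL.

```sql
CREATE OR REPLACE PROCEDURE DeleteDuplicates
IS
    -- Step 1: select only rows whose key occurs more than once,
    -- sorted by the duplicate key.
    CURSOR csr_Duplicates IS
        SELECT ROWID AS row_id, LastName, FirstName
        FROM   Customers
        WHERE  (LastName, FirstName) IN (
                   SELECT   LastName, FirstName
                   FROM     Customers
                   GROUP BY LastName, FirstName
                   HAVING   COUNT(*) > 1
               )
        ORDER BY LastName, FirstName;

    v_PrevLastName  Customers.LastName%TYPE;
    v_PrevFirstName Customers.FirstName%TYPE;
    v_FirstFetch    BOOLEAN := TRUE;
BEGIN
    -- Step 2: the cursor FOR loop opens the cursor and fetches row by row.
    FOR rec IN csr_Duplicates LOOP
        -- Step 3: compare the key with the previously fetched one.
        IF v_FirstFetch
           OR rec.LastName  <> v_PrevLastName
           OR rec.FirstName <> v_PrevFirstName
        THEN
            -- Step 4a: first row of a new group, so keep it.
            v_FirstFetch    := FALSE;
            v_PrevLastName  := rec.LastName;
            v_PrevFirstName := rec.FirstName;
        ELSE
            -- Step 4b: duplicate within the same group, so delete it.
            DELETE FROM Customers WHERE ROWID = rec.row_id;
        END IF;
    END LOOP;
END DeleteDuplicates;
/
```

The sketch leaves the COMMIT to the caller, so the deletions can still be rolled back after inspecting the result.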
Let's run the stored procedure and check it against the Customers data:
BEGIN
DeleteDuplicates;
END;
/
The main job of extracting duplicates in this procedure is done by a SQL statement, which is
defined in the csr_Duplicates cursor. The PL/SQL procedural code is used only to implement the
logic of deleting all rows in the group except the first one. Could it all be done by one SQL
statement?