How to Delete Duplicate Records in Oracle

Summary: in this tutorial, you will learn step by step how to delete duplicate records in Oracle Database using the DELETE statement with a subquery.

Once you’ve found the duplicate records in a table, you often want to delete the unwanted copies to keep your data clean.

If a table has a few duplicate rows, you could do this manually one by one by using a simple DELETE statement. However, it is time-consuming to do it manually if the table has a large number of duplicate records. It is better to construct a single statement that removes all unwanted copies in one go.

Before removing the duplicate records, you must decide which instances you want to keep. For example, you may want to preserve the newest or oldest row. In this case, you need a column in the table like id column that is not the part of the group used to evaluate duplicate.

Consider the following fruits table:

how to delete duplicate records in oracle - sample table

-- create fruit table
CREATE TABLE fruits (
        fruit_id   NUMBER generated BY DEFAULT AS IDENTITY,
        fruit_name VARCHAR2(100),
        color VARCHAR2(20),
        PRIMARY KEY (fruit_id)
);

-- insert sample rows
INSERT INTO fruits(fruit_name,color) VALUES('Apple','Red');
INSERT INTO fruits(fruit_name,color) VALUES('Apple','Red');
INSERT INTO fruits(fruit_name,color) VALUES('Orange','Orange');
INSERT INTO fruits(fruit_name,color) VALUES('Orange','Orange');
INSERT INTO fruits(fruit_name,color) VALUES('Orange','Orange');
INSERT INTO fruits(fruit_name,color) VALUES('Banana','Yellow');
INSERT INTO fruits(fruit_name,color) VALUES('Banana','Green');

-- query data from the fruit table
SELECT * FROM fruits; 
Code language: SQL (Structured Query Language) (sql)

Suppose you want to keep the row with the highest fruit_id and delete all other copies. The following query returns the last entry for each combination of fruit_name and color:

SELECT
  MAX(fruit_id)
FROM
  fruits
GROUP BY
  fruit_name,
  color
ORDER BY
  MAX(fruit_id);
Code language: SQL (Structured Query Language) (sql)

how to delete duplicate records in oracle - max id

You use the following DELETE statement with a subquery to delete rows whose values are in the fruit_id column are not the highest ones.

DELETE FROM
  fruits
WHERE
  fruit_id NOT IN
  (
    SELECT
      MAX(fruit_id)
    FROM
      fruits
    GROUP BY
      fruit_name,
      color
  );
Code language: SQL (Structured Query Language) (sql)

Three rows were deleted which is what we expected:

SELECT
  *
FROM
  fruits;
Code language: SQL (Structured Query Language) (sql)

how to delete duplicate records in oracle - after removing duplicates

Likewise, if you want to keep the row with the lowest ID, you use the MIN() function instead of the MAX() function:

DELETE FROM
  fruits
WHERE
  fruit_id NOT IN
  (
    SELECT
      MIN(fruit_id)
    FROM
      fruits
    GROUP BY
      fruit_name,
      color
  );
Code language: SQL (Structured Query Language) (sql)

This method above works if you have a column that is not part of the group for evaluating duplicates. If all values in all columns can have copies, then you cannot use the fruit_id column anymore.

Let’s drop and create the fruits table with a new structure and data as follows:

DROP TABLE fruits;

CREATE TABLE fruits (
        fruit_id   NUMBER,
        fruit_name VARCHAR2(100),
        color VARCHAR2(20)
);

INSERT INTO fruits(fruit_id,fruit_name,color) VALUES(1,'Apple','Red');
INSERT INTO fruits(fruit_id,fruit_name,color) VALUES(1,'Apple','Red');
INSERT INTO fruits(fruit_id,fruit_name,color) VALUES(2,'Orange','Orange');
INSERT INTO fruits(fruit_id,fruit_name,color) VALUES(2,'Orange','Orange');
INSERT INTO fruits(fruit_id,fruit_name,color) VALUES(2,'Orange','Orange');
INSERT INTO fruits(fruit_id,fruit_name,color) VALUES(3,'Banana','Yellow');
INSERT INTO fruits(fruit_id,fruit_name,color) VALUES('4,Banana','Green');

SELECT * FROM fruits;
Code language: SQL (Structured Query Language) (sql)

how to delete duplicate records in oracle - sample table with duplicates in all columns

In the fruits table, the values in all columns fruit_id, fruit_name, and color have copies.

In this case, you can use the rowid which is a physical locator that specifies where on storage Oracle stores the row. Because the rowid is unique to each row, you can use it to remove the duplicates as shown below:

DELETE
FROM
  fruits
WHERE
  rowid NOT IN
  (
    SELECT
      MIN(rowid)
    FROM
      fruits
    GROUP BY
      fruit_id,
      fruit_name,
      color
  );Code language: SQL (Structured Query Language) (sql)

The following query verifies the delete operation:

SELECT
  *
FROM
  fruits;Code language: SQL (Structured Query Language) (sql)

how to delete duplicate records in oracle - after deleting copies

It worked as expected.

Now you should know how to delete duplicate records in the Oracle Database.

Was this tutorial helpful?