Find the Duplicate Rows and remove in Oracle and add the primary key contrint

When you try to enforce a primary key on an existing table with duplicate values, Oracle will throw an error because primary keys must be unique and non-null.

Here’s how you can identify and remove duplicates and then add the primary key constraint safely.


✅ Step-by-Step Process

🔹 1. Find the Duplicate Rows

Assume your table is my_table and you want to set a primary key on the column id.

SELECT id, COUNT(*)
FROM my_table
GROUP BY id
HAVING COUNT(*) > 1;

This query gives you the list of ids that are duplicated.


🔹 2. Find Full Duplicate Records

To review full rows with duplicates:

SELECT *
FROM my_table
WHERE id IN (
  SELECT id
  FROM my_table
  GROUP BY id
  HAVING COUNT(*) > 1
);

🔹 3. Remove Duplicates (Keep One Row)

If you're sure you can delete duplicates and keep only one of each, you can use ROWID like this:

DELETE FROM my_table
WHERE ROWID NOT IN (
  SELECT MIN(ROWID)
  FROM my_table
  GROUP BY id
);

⚠️ Be careful — this deletes all but one row for each duplicate id. Back up your data before running this on production.


🔹 4. Add Primary Key Constraint

Now that duplicates are gone, you can safely add the constraint:

ALTER TABLE my_table
ADD CONSTRAINT pk_my_table_id PRIMARY KEY (id);

🛑 What If You Don't Want to Delete Data?

If you don’t want to delete but still want to enforce uniqueness moving forward:

  • Create a new table, clean the data, and migrate valid rows.
  • Or add a unique index with IGNORE_ROW_ON_DUPKEY_INDEX (from Oracle 11g onwards) to skip duplicates, but this works mostly in inserts, not constraint creation.


Here's a complete PL/SQL script that will:

  • Identify and log duplicates into a backup table.
  • Delete the duplicates (keeping one row for each duplicate key).
  • Add a primary key constraint safely.

  • Let’s assume:
  • Your table is named my_table.
  • The column to become primary key is id.


PL/SQL Script: Log & Remove Duplicates, Then Add Primary Key

-- 1. Create a backup table for duplicates
BEGIN
   EXECUTE IMMEDIATE 'CREATE TABLE my_table_duplicates AS SELECT * FROM my_table WHERE 1=0';
EXCEPTION
   WHEN OTHERS THEN
      IF SQLCODE != -955 THEN -- Table already exists
         RAISE;
      END IF;
END;
/

-- 2. Insert duplicate rows into the backup table
INSERT INTO my_table_duplicates
SELECT *
FROM my_table
WHERE id IN (
   SELECT id
   FROM my_table
   GROUP BY id
   HAVING COUNT(*) > 1
);

-- 3. Delete duplicate rows (keep one row per ID)
DELETE FROM my_table
WHERE ROWID NOT IN (
   SELECT MIN(ROWID)
   FROM my_table
   GROUP BY id
);

-- 4. Add primary key constraint
BEGIN
   EXECUTE IMMEDIATE 'ALTER TABLE my_table ADD CONSTRAINT pk_my_table_id PRIMARY KEY (id)';
EXCEPTION
   WHEN OTHERS THEN
      DBMS_OUTPUT.PUT_LINE('Error adding primary key: ' || SQLERRM);
END;
/

🧾 Notes:

  • Step 1: Creates a table my_table_duplicates to log the duplicates.
  • Step 2: Copies all duplicate rows into the backup.
  • Step 3: Deletes all duplicates except one per id.
  • Step 4: Tries to add the primary key constraint.


🧠 Optional Improvements

  • Add a timestamp or user info to my_table_duplicates for audit trail.
  • Parameterize for use with other tables or columns.


Post a Comment

And that's all there is to it!

If anyone has any other questions or requests for future How To posts, you can either ask them in the comments or email me. Please don't feel shy at all!

I'm certainly not an expert, but I'll try my hardest to explain what I do know and research what I don't know.

Previous Post Next Post