Skip to main content

Command Palette

Search for a command to run...

LeetCode 196: Delete Duplicate Emails

LeetCode 196 Solution: How to Remove Duplicate Emails

Updated
2 min read
LeetCode 196: Delete Duplicate Emails
V

I'm Varchasv, a Data Engineer working on enterprise data integration.

Currently on a 90-problem challenge to level up my technical skills and switch to a more development-focused role.

What I'm doing:

  • Solving 2 Leetcode problems daily (SQL + DSA).
  • Blogging about each problem.
  • Building in public.

My Goal - Land a better data engineering role by mid-2026.

Follow my journey !!

Date: January 20, 2026
Category: SQL
Time Taken: 20 minutes
Difficulty: Easy


Problem Statement

Write a solution to delete all duplicate emails, keeping only one unique email with the smallest id.

For SQL users, please note that you are supposed to write a DELETE statement and not a SELECT one.

Link: Delete Duplicate Emails


My Approach

Initial thought:

  • I initially tried using DENSE_RANK() to assign a rank to each row, allowing me to delete any id where rank > 1.

  • This approach was incorrect because DENSE_RANK() assigns the same rank to identical emails.

Solution 1 (ROW_NUMBER):

  • Create a CTE using ROW_NUMBER() and PARTITION BY email to assign a unique rank to each row.

  • Then, use a SELECT subquery within the DELETE statement to choose the ids where rank > 1.

Why this works: This method is effective because, unlike DENSE_RANK(), ROW_NUMBER() assigns a unique rank to identical emails, whereas DENSE_RANK() does the opposite in this case.

Solution 2 (Self-join):

Do a self-join on the table based on the rows having the same email but not the same id.

Why this works: This method works because we don’t need to perform any complex transformations or use window functions.

The self-join solution is good when we need an easier and less complex method, but ROW_NUMBER() is the more modern method and it avoids any cartesian product problems.


Solution Code (ROW_NUMBER)

WITH CTE AS(
    SELECT id,
    ROW_NUMBER() OVER (PARTITION BY email ORDER BY id ASC) AS rnk
    FROM Person
)

DELETE FROM Person
WHERE id in (SELECT id FROM CTE WHERE rnk > 1)

Solution Code (Self-join)

DELETE p1
FROM Person AS p1
JOIN Person AS p2
ON p1.email = p2.email AND p1.id > p2.id

Key Takeaway: ROW_NUMBER() should be used for deduplication.

Pattern: Self-join, Window functions

Mistakes I Made:

  • Did not know how DELETE clause works.

  • Used DENSE_RANK() instead of ROW_NUMBER() for deduplication.

Series: 90 Days of Data Engineering Progress: 11/90 problems completed

Tags: #DEQuest #LeetCode #SQL #DataEngineering #BuildInPublic

90 Days of Data Engineering

Part 12 of 48

Solving 90 problems over 18 weeks. Daily posts Mon-Fri. Goal: Switch to development role.

Up next

LeetCode 238: Product of Array Except Self

Solving LeetCode 238: Product of Array Except Self

Remove Duplicate Emails: LeetCode 196