Delete duplicate rows in sql Server
We will discuss deleting all duplicate rows except one from a sql server table.
Let me explain what we want to achieve. We will be using Employees table for this example.
Delete duplicate rows in sql
SQL Script to create Employees table
Create table Employees
(
ID int,
FirstName nvarchar(50),
LastName nvarchar(50),
Gender nvarchar(50),
Salary int
)
GO
Insert into Employees values (1, ‘Mark’, ‘Hastings’, ‘Male’, 60000)
Insert into Employees values (1, ‘Mark’, ‘Hastings’, ‘Male’, 60000)
Insert into Employees values (1, ‘Mark’, ‘Hastings’, ‘Male’, 60000)
Insert into Employees values (2, ‘Mary’, ‘Lambeth’, ‘Female’, 30000)
Insert into Employees values (2, ‘Mary’, ‘Lambeth’, ‘Female’, 30000)
Insert into Employees values (3, ‘Ben’, ‘Hoskins’, ‘Male’, 70000)
Insert into Employees values (3, ‘Ben’, ‘Hoskins’, ‘Male’, 70000)
Insert into Employees values (3, ‘Ben’, ‘Hoskins’, ‘Male’, 70000)
The delete query should delete all duplicate rows except one. The output should be as shown below, after the delete query is executed.
Delete all duplicate rows except one in sql
Here is the SQL query that does the job. PARTITION BY divides the query result set into partitions.
WITH EmployeesCTE AS
(
SELECT *, ROW_NUMBER()OVER(PARTITION BY ID ORDER BY ID) AS RowNumber
FROM Employees
)
DELETE FROM EmployeesCTE WHERE RowNumber > 1