How to manually delete duplicate rows with MS SQL Server

Often you just have a couple of duplicate rows that you want to delete.
It's much easier and simpler to delete them manually.
But how do you delete all the duplicates except for the one you want to keep?

To delete duplicate rows in bulk see: How to delete duplicate rows in batch

Data used

use sql911;
go

select emp_short, 
       sales_date, 
       sales_qty as 'Sold'
from sales
order by emp_short, sales_date;
go
emp_short  sales_date              Sold
---------- ----------------------- ---------------------------------------
000010     2005-04-06 15:01:01.000 4.00
000010     2005-11-06 10:01:01.000 9.00
000010     2005-12-06 11:01:01.000 8.00
000010     2005-12-06 11:01:01.000 8.00
000010     2005-12-06 11:01:01.000 8.00
000010     2006-04-06 15:01:01.000 4.00
000010     2006-05-06 16:01:01.000 3.00
000010     2006-05-06 16:01:01.000 3.00
000010     2006-07-06 19:01:01.000 1.00
000010     2007-04-06 00:00:00.000 4.00
000010     2007-04-06 00:00:00.000 4.00
000010     2007-04-06 00:00:00.000 4.00
000010     2007-04-06 00:00:00.000 4.00
000010     2007-06-06 17:01:01.000 2.00
000030     2004-02-06 13:01:01.000 6.00
000030     2005-10-06 10:01:01.000 10.00
000030     2006-01-06 12:01:01.000 7.00
000030     2006-04-06 15:01:01.000 4.00
000030     2006-04-06 15:01:01.000 4.00
000030     2006-06-06 17:01:01.000 2.00
000030     2007-05-01 00:00:00.000 5.00
000030     2007-05-01 00:00:00.000 5.00
000030     2007-05-01 00:00:00.000 5.00
000030     2007-05-01 00:00:00.000 5.00
000030     2007-05-01 00:00:00.000 5.00
000030     2007-05-01 00:00:00.000 5.00
000030     2007-05-01 00:00:00.000 5.00
000030     2007-11-06 10:01:01.000 9.00
000110     2004-03-06 14:01:01.000 5.00
000110     2005-11-06 10:01:01.000 9.00
000110     2006-02-06 13:01:01.000 6.00
000110     2006-03-06 14:01:01.000 5.00
000110     2006-05-06 16:01:01.000 3.00
000110     2006-06-06 17:01:01.000 2.00
000110     2006-07-06 19:01:01.000 1.00
000110     2006-07-06 19:01:01.000 1.00
000190     2005-03-06 14:01:01.000 5.00
000190     2005-05-06 16:01:01.000 3.00
000190     2005-12-06 11:01:01.000 8.00
000190     2006-01-06 12:01:01.000 7.00
000190     2006-01-06 12:01:01.000 7.00
000190     2006-02-06 13:01:01.000 6.00
000190     2006-05-06 16:01:01.000 3.00
000190     2006-06-06 17:01:01.000 2.00
000190     2006-06-06 17:01:01.000 2.00
000190     2006-07-06 00:00:00.000 1.00
000190     2006-07-06 19:01:01.000 1.00
000190     2007-07-06 00:00:00.000 1.00
000190     2007-07-06 00:00:00.000 1.00
000190     2007-07-06 00:00:00.000 1.00
000310     2004-10-06 10:01:01.000 10.00
000310     2005-10-06 10:01:01.000 10.00
000310     2005-10-06 10:01:01.000 10.00
000310     2005-10-06 10:01:01.000 10.00
000310     2005-11-06 10:01:01.000 9.00
000310     2005-11-06 10:01:01.000 9.00
000310     2006-01-06 12:01:01.000 7.00
000310     2006-03-06 14:01:01.000 5.00
000310     2006-07-06 19:01:01.000 1.00
000310     2006-10-06 10:01:01.000 10.00

(60 row(s) affected)

Finding the duplicate rows

  • The duplicate rows occur more than once.
  • It's kind of self-evident, but that's how we find the duplicates.

We want to find the duplicate sales. What makes a duplicate sale?

  • The same rep
  • The same date
  • The same quantity
use sql911;
go

select emp_short, 
       sales_date, 
       sum(sales_qty) as 'Totals', 
       count(emp_short) as 'Duplicates'
from sales
group by emp_short, sales_date, sales_qty
having count(emp_short) > 1
order by emp_short;
go
emp_short  sales_date              Totals     Duplicates
---------- ----------------------- ---------- -----------
000010     2005-12-06 11:01:01.000 24.00      3
000010     2006-05-06 16:01:01.000 6.00       2
000010     2007-04-06 00:00:00.000 16.00      4
000030     2006-04-06 15:01:01.000 8.00       2
000030     2007-05-01 00:00:00.000 35.00      7
000110     2006-07-06 19:01:01.000 2.00       2
000190     2006-01-06 12:01:01.000 14.00      2
000190     2006-06-06 17:01:01.000 4.00       2
000190     2007-07-06 00:00:00.000 3.00       3
000310     2005-10-06 10:01:01.000 30.00      3
000310     2005-11-06 10:01:01.000 18.00      2

(11 row(s) affected)
  • You must group by all the columns that make the duplicate.

Deleting manually the duplicate rows

We will be dealing with employee: 000010 with sales for 2007-04-06 and a quantity sold of 4.

select emp_short, 
       sales_date, 
       sales_qty as 'Sold'
from sales
where emp_short = '000010' and
      sales_date = '2007-04-06'
order by emp_short, sales_date;
go
emp_short  sales_date              Sold
---------- ----------------------- -------
000010     2007-04-06 00:00:00.000 4.00
000010     2007-04-06 00:00:00.000 4.00
000010     2007-04-06 00:00:00.000 4.00
000010     2007-04-06 00:00:00.000 4.00

(4 row(s) affected)
use sql911;
go

set rowcount 1
delete from sales 
where emp_short = '000010' and sales_date = '2007-04-06' and sales_qty = 4;
go
(1 row(s) affected)
  • ALWAYS, ALWAYS reselect the database before deleting. Very often, we use the same table names in different databases.
  • SET ROWCOUNT 1 is the number of: "how many rows are left". If you want to delete only 1 row at a time, you would use: SET ROWCOUNT N-1, where N is the total number of duplicates.

Verification of deleting all the duplicate rows

select emp_short, 
       sales_date, 
       sales_qty as 'Sold'
from sales
where emp_short = '000010' and
      sales_date = '2007-04-06';
go
emp_short  sales_date              Sold
---------- ----------------------- ---------------------------------------
000010     2007-04-06 00:00:00.000 4.00

(1 row(s) affected)