How to manually delete duplicate rows with MS SQL Server
Often you just have a couple of duplicate rows that you want to delete.
It's much easier and simpler to delete them manually.
But how do you delete all the duplicates except for the one you want to keep?
To delete duplicate rows in bulk see: How to delete duplicate rows in batch
Data used
use sql911;
go
select emp_short,
sales_date,
sales_qty as 'Sold'
from sales
order by emp_short, sales_date;
go
emp_short sales_date Sold ---------- ----------------------- --------------------------------------- 000010 2005-04-06 15:01:01.000 4.00 000010 2005-11-06 10:01:01.000 9.00 000010 2005-12-06 11:01:01.000 8.00 000010 2005-12-06 11:01:01.000 8.00 000010 2005-12-06 11:01:01.000 8.00 000010 2006-04-06 15:01:01.000 4.00 000010 2006-05-06 16:01:01.000 3.00 000010 2006-05-06 16:01:01.000 3.00 000010 2006-07-06 19:01:01.000 1.00 000010 2007-04-06 00:00:00.000 4.00 000010 2007-04-06 00:00:00.000 4.00 000010 2007-04-06 00:00:00.000 4.00 000010 2007-04-06 00:00:00.000 4.00 000010 2007-06-06 17:01:01.000 2.00 000030 2004-02-06 13:01:01.000 6.00 000030 2005-10-06 10:01:01.000 10.00 000030 2006-01-06 12:01:01.000 7.00 000030 2006-04-06 15:01:01.000 4.00 000030 2006-04-06 15:01:01.000 4.00 000030 2006-06-06 17:01:01.000 2.00 000030 2007-05-01 00:00:00.000 5.00 000030 2007-05-01 00:00:00.000 5.00 000030 2007-05-01 00:00:00.000 5.00 000030 2007-05-01 00:00:00.000 5.00 000030 2007-05-01 00:00:00.000 5.00 000030 2007-05-01 00:00:00.000 5.00 000030 2007-05-01 00:00:00.000 5.00 000030 2007-11-06 10:01:01.000 9.00 000110 2004-03-06 14:01:01.000 5.00 000110 2005-11-06 10:01:01.000 9.00 000110 2006-02-06 13:01:01.000 6.00 000110 2006-03-06 14:01:01.000 5.00 000110 2006-05-06 16:01:01.000 3.00 000110 2006-06-06 17:01:01.000 2.00 000110 2006-07-06 19:01:01.000 1.00 000110 2006-07-06 19:01:01.000 1.00 000190 2005-03-06 14:01:01.000 5.00 000190 2005-05-06 16:01:01.000 3.00 000190 2005-12-06 11:01:01.000 8.00 000190 2006-01-06 12:01:01.000 7.00 000190 2006-01-06 12:01:01.000 7.00 000190 2006-02-06 13:01:01.000 6.00 000190 2006-05-06 16:01:01.000 3.00 000190 2006-06-06 17:01:01.000 2.00 000190 2006-06-06 17:01:01.000 2.00 000190 2006-07-06 00:00:00.000 1.00 000190 2006-07-06 19:01:01.000 1.00 000190 2007-07-06 00:00:00.000 1.00 000190 2007-07-06 00:00:00.000 1.00 000190 2007-07-06 00:00:00.000 1.00 000310 2004-10-06 10:01:01.000 10.00 000310 2005-10-06 10:01:01.000 10.00 000310 2005-10-06 10:01:01.000 10.00 000310 2005-10-06 10:01:01.000 10.00 000310 2005-11-06 10:01:01.000 9.00 000310 2005-11-06 10:01:01.000 9.00 000310 2006-01-06 12:01:01.000 7.00 000310 2006-03-06 14:01:01.000 5.00 000310 2006-07-06 19:01:01.000 1.00 000310 2006-10-06 10:01:01.000 10.00 (60 row(s) affected)
Finding the duplicate rows
- The duplicate rows occur more than once.
- It's kind of self-evident, but that's how we find the duplicates.
We want to find the duplicate sales. What makes a duplicate sale?
- The same rep
- The same date
- The same quantity
use sql911;
go
select emp_short,
sales_date,
sum(sales_qty) as 'Totals',
count(emp_short) as 'Duplicates'
from sales
group by emp_short, sales_date, sales_qty
having count(emp_short) > 1
order by emp_short;
go
emp_short sales_date Totals Duplicates ---------- ----------------------- ---------- ----------- 000010 2005-12-06 11:01:01.000 24.00 3 000010 2006-05-06 16:01:01.000 6.00 2 000010 2007-04-06 00:00:00.000 16.00 4 000030 2006-04-06 15:01:01.000 8.00 2 000030 2007-05-01 00:00:00.000 35.00 7 000110 2006-07-06 19:01:01.000 2.00 2 000190 2006-01-06 12:01:01.000 14.00 2 000190 2006-06-06 17:01:01.000 4.00 2 000190 2007-07-06 00:00:00.000 3.00 3 000310 2005-10-06 10:01:01.000 30.00 3 000310 2005-11-06 10:01:01.000 18.00 2 (11 row(s) affected)
- You must group by all the columns that make the duplicate.
Deleting manually the duplicate rows
We will be dealing with employee: 000010 with sales for 2007-04-06 and a quantity sold of 4.
select emp_short,
sales_date,
sales_qty as 'Sold'
from sales
where emp_short = '000010' and
sales_date = '2007-04-06'
order by emp_short, sales_date;
go
emp_short sales_date Sold ---------- ----------------------- ------- 000010 2007-04-06 00:00:00.000 4.00 000010 2007-04-06 00:00:00.000 4.00 000010 2007-04-06 00:00:00.000 4.00 000010 2007-04-06 00:00:00.000 4.00 (4 row(s) affected)
use sql911; go set rowcount 1 delete from sales where emp_short = '000010' and sales_date = '2007-04-06' and sales_qty = 4; go
(1 row(s) affected)
- ALWAYS, ALWAYS reselect the database before deleting. Very often, we use the same table names in different databases.
- SET ROWCOUNT 1 is the number of: "how many rows are left". If you want to delete only 1 row at a time, you would use: SET ROWCOUNT N-1, where N is the total number of duplicates.
Verification of deleting all the duplicate rows
select emp_short,
sales_date,
sales_qty as 'Sold'
from sales
where emp_short = '000010' and
sales_date = '2007-04-06';
go
emp_short sales_date Sold ---------- ----------------------- --------------------------------------- 000010 2007-04-06 00:00:00.000 4.00 (1 row(s) affected)













