Groups v. Distinct with SQL Server
- In SQL, including SQL Server, groups by definition must have at least 1 row.
- In SQL, including SQL Server, groups are always distinct.
Summary
- Never use DISTINCT in SELECT queries if you are using the GROUP BY clause.
- The combination of groups and distinct will seriously slow down large and complex queries.
- If possible do not use either DISTINCT or GROUP BY in very large queries.
Data used
select projno,projname,deptno,respemp,prstaff,prstdate,prendate from project; projno projname deptno respemp prstaff prstdate prendate ------ ------------------------ ------ ------- --------------------------------------- ----------------------- ----------------------- AD3100 ADMIN SERVICES D01 000010 6.50 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 AD3110 GENERAL ADMINSYSTEMS D21 000070 6.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 AD3111 PAYROLL PROGRAMMING D21 000230 2.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 AD3112 PERSONNEL PROGRAMMING D21 000250 1.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 AD3113 ACCOUNT PROGRAMMING D21 000270 2.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 IF1000 QUERY SERVICES C01 000030 2.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 IF2000 USER EDUCATION C01 000030 1.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 MA2100 WELDLINE AUTOMATION D01 000010 12.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 MA2110 W L PROGRAMMING D11 000060 9.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 MA2111 W L PROGRAM DESIGN D11 000220 2.00 2005-01-01 00:00:00.000 2005-12-01 00:00:00.000 MA2112 W L ROBOT DESIGN D11 000150 3.00 2005-01-01 00:00:00.000 2005-12-01 00:00:00.000 MA2113 W L PRODCONT PROGS D11 000160 3.00 2005-02-15 00:00:00.000 2005-12-01 00:00:00.000 OP1000 OPERATION SUPPORT E01 000050 6.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 OP1010 OPERATION E11 000090 5.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 OP2000 GEN SYSTEM SSERVICES E01 000050 5.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 OP2010 SYSTEMS SUPPORT E21 000100 4.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 OP2011 SCP SYSTEMS SUPPORT E21 000320 1.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 OP2012 APPLICATIONS SUPPORT E21 000330 1.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 OP2013 DB/DCSUPPORT E21 000340 1.00 2005-01-01 00:00:00.000 2006-02-01 00:00:00.000 PL2100 WELD LINE PLANNING B01 000020 1.00 2005-01-01 00:00:00.000 2005-09-15 00:00:00.000 (20 row(s) affected)
select deptno from project; deptno ------ D01 D21 D21 D21 D21 C01 C01 D01 D11 D11 D11 D11 E01 E11 E01 E21 E21 E21 E21 B01 (20 row(s) affected)
This represents all the departments and how many projects there are for each department. Each entry of a department represents a project.
GROUP BY only
- This will give the number of project for each department.
select deptno, count(deptno) as 'Number of projects' from project GROUP BY deptno; deptno Number of projects ------ ------------------ B01 1 C01 2 D01 2 D11 4 D21 4 E01 2 E11 1 E21 4 (8 row(s) affected)
GROUP BY with DISTINCT
select DISTINCT deptno, count(deptno) as 'Number of projects' from project GROUP BY deptno; deptno Number of projects ------ ------------------ B01 1 C01 2 D01 2 D11 4 D21 4 E01 2 E11 1 E21 4 (8 row(s) affected)
- Please note that the results are IDENTICAL to the GROUP BY without the distinct.
- DISTINCT implies a GROUP BY.
DISTINCT only
select DISTINCT deptno, count(deptno) as 'Number of projects' from project; deptno Number of projects ------ ------------------ B01 1 C01 2 D01 2 D11 4 D21 4 E01 2 E11 1 E21 4 (8 row(s) affected)
- Please note that the results are 100% IDENTICAL to the GROUP BY.
- DISTINCT always implies a GROUP BY.

