Groups v. Distinct with SQL Server

  • In SQL, including SQL Server, groups by definition must have at least 1 row.
  • In SQL, including SQL Server, groups are always distinct.

Summary

  • Never use DISTINCT in SELECT queries if you are using the GROUP BY clause.
  • The combination of groups and distinct will seriously slow down large and complex queries.
  • If possible do not use either DISTINCT or GROUP BY in very large queries.

Data used

select projno,projname,deptno,respemp,prstaff,prstdate,prendate from project;

projno projname                 deptno respemp prstaff                                 prstdate                prendate
------ ------------------------ ------ ------- --------------------------------------- ----------------------- -----------------------
AD3100 ADMIN SERVICES           D01    000010  6.50                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
AD3110 GENERAL ADMINSYSTEMS     D21    000070  6.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
AD3111 PAYROLL PROGRAMMING      D21    000230  2.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
AD3112 PERSONNEL PROGRAMMING    D21    000250  1.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
AD3113 ACCOUNT PROGRAMMING      D21    000270  2.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
IF1000 QUERY SERVICES           C01    000030  2.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
IF2000 USER EDUCATION           C01    000030  1.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
MA2100 WELDLINE AUTOMATION      D01    000010  12.00                                   2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
MA2110 W L PROGRAMMING          D11    000060  9.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
MA2111 W L PROGRAM DESIGN       D11    000220  2.00                                    2005-01-01 00:00:00.000 2005-12-01 00:00:00.000
MA2112 W L ROBOT DESIGN         D11    000150  3.00                                    2005-01-01 00:00:00.000 2005-12-01 00:00:00.000
MA2113 W L PRODCONT PROGS       D11    000160  3.00                                    2005-02-15 00:00:00.000 2005-12-01 00:00:00.000
OP1000 OPERATION SUPPORT        E01    000050  6.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
OP1010 OPERATION                E11    000090  5.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
OP2000 GEN SYSTEM SSERVICES     E01    000050  5.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
OP2010 SYSTEMS SUPPORT          E21    000100  4.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
OP2011 SCP SYSTEMS SUPPORT      E21    000320  1.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
OP2012 APPLICATIONS SUPPORT     E21    000330  1.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
OP2013 DB/DCSUPPORT             E21    000340  1.00                                    2005-01-01 00:00:00.000 2006-02-01 00:00:00.000
PL2100 WELD LINE PLANNING       B01    000020  1.00                                    2005-01-01 00:00:00.000 2005-09-15 00:00:00.000

(20 row(s) affected)

select deptno from project;

deptno
------
D01
D21
D21
D21
D21
C01
C01
D01
D11
D11
D11
D11
E01
E11
E01
E21
E21
E21
E21
B01

(20 row(s) affected)

This represents all the departments and how many projects there are for each department. Each entry of a department represents a project.

GROUP BY only

  • This will give the number of project for each department.
select deptno, count(deptno) as 'Number of projects' 
from project 
GROUP BY deptno;

deptno Number of projects
------ ------------------
B01    1
C01    2
D01    2
D11    4
D21    4
E01    2
E11    1
E21    4

(8 row(s) affected)

GROUP BY with DISTINCT

select DISTINCT deptno, count(deptno) as 'Number of projects' 
from project 
GROUP BY deptno;

deptno Number of projects
------ ------------------
B01    1
C01    2
D01    2
D11    4
D21    4
E01    2
E11    1
E21    4

(8 row(s) affected)
  • Please note that the results are IDENTICAL to the GROUP BY without the distinct.
  • DISTINCT implies a GROUP BY.

DISTINCT only

select DISTINCT deptno, count(deptno) as 'Number of projects' 
from project;

deptno Number of projects
------ ------------------
B01    1
C01    2
D01    2
D11    4
D21    4
E01    2
E11    1
E21    4

(8 row(s) affected)
  • Please note that the results are 100% IDENTICAL to the GROUP BY.
  • DISTINCT always implies a GROUP BY.