What to do when not all items in select clause are in the Group By
Problem
All items in the select clause must be in the group by clause, except for the aggregate functions, such as MIN,MAX,SUM,COUNT...
- SELECT projno,projname,COUNT(projno)
- FROM project
- GROUP BY projno;
You get the following error message:
Msg 8120, Level 16, State 1, Line 1 Column 'project.PROJNAME' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Data used
- SELECT sales_date AS 'Date', sales_rep AS 'Rep', sales_region AS 'Region', sales_num AS 'Sales'
- FROM sales02;
Date Rep Region Sales ----------------------- ------------------------- ------------------------- ----------- 2005-12-31 00:00:00.000 LUCCHESSI Ontario-South 1 2005-12-31 00:00:00.000 LEE Ontario-South 3 2005-12-31 00:00:00.000 LEE Quebec 1 2005-12-31 00:00:00.000 LEE Manitoba 2 2005-12-31 00:00:00.000 GOUNOT Quebec 1 2006-03-29 00:00:00.000 LUCCHESSI Ontario-South 3 2006-03-29 00:00:00.000 LUCCHESSI Quebec 1 2006-03-29 00:00:00.000 LEE Ontario-South 2 2006-03-29 00:00:00.000 LEE Ontario-North 2 2006-03-29 00:00:00.000 LEE Quebec 3 2006-03-29 00:00:00.000 LEE Manitoba 5 2006-03-29 00:00:00.000 GOUNOT Ontario-South 3 2006-03-29 00:00:00.000 GOUNOT Quebec 1 2006-03-29 00:00:00.000 GOUNOT Manitoba 7 2006-03-30 00:00:00.000 LUCCHESSI Ontario-South 1 2006-03-30 00:00:00.000 LUCCHESSI Quebec 2 2006-03-30 00:00:00.000 LUCCHESSI Manitoba 1 2006-03-30 00:00:00.000 LEE Ontario-South 7 2006-03-30 00:00:00.000 LEE Ontario-North 3 2006-03-30 00:00:00.000 LEE Quebec 7 2006-03-30 00:00:00.000 LEE Manitoba 4 2006-03-30 00:00:00.000 GOUNOT Ontario-South 2 2006-03-30 00:00:00.000 GOUNOT Quebec 18 2006-03-30 00:00:00.000 GOUNOT Manitoba 1 2006-03-31 00:00:00.000 LUCCHESSI Manitoba 1 2006-03-31 00:00:00.000 LEE Ontario-South 0 2006-03-31 00:00:00.000 LEE Ontario-North 3 2006-03-31 00:00:00.000 LEE Quebec 7 2006-03-31 00:00:00.000 LEE Manitoba 3 2006-03-31 00:00:00.000 GOUNOT Ontario-South 2 2006-03-31 00:00:00.000 GOUNOT Quebec 1 2006-04-01 00:00:00.000 LUCCHESSI Ontario-South 3 2006-04-01 00:00:00.000 LUCCHESSI Manitoba 1 2006-04-01 00:00:00.000 LEE Ontario-South 8 2006-04-01 00:00:00.000 LEE Ontario-North 11 2006-04-01 00:00:00.000 LEE Quebec 8 2006-04-01 00:00:00.000 LEE Manitoba 9 2006-04-01 00:00:00.000 GOUNOT Ontario-South 3 2006-04-01 00:00:00.000 GOUNOT Ontario-North 1 2006-04-01 00:00:00.000 GOUNOT Quebec 3 2006-04-01 00:00:00.000 GOUNOT Manitoba 7 (41 row(s) affected)
- SELECT sales_rep, sales_region, SUM(sales_num)
- FROM sales02
- GROUP BY sales_rep;
Msg 8120, Level 16, State 1, Line 1 Column 'sales02.sales_region' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
All items in the Group By
- Place all the items of the select clause in the Group By clause.
- This will show the sales by sales rep and by region.
- SELECT sales_rep, sales_region, SUM(sales_num) AS 'Num of Sales'
- FROM sales02
- GROUP BY sales_rep, sales_region;
sales_rep sales_region Num of Sales ------------------------- ------------------------- ----------- GOUNOT Manitoba 15 LEE Manitoba 23 LUCCHESSI Manitoba 3 GOUNOT Ontario-North 1 LEE Ontario-North 19 GOUNOT Ontario-South 10 LEE Ontario-South 20 LUCCHESSI Ontario-South 8 GOUNOT Quebec 24 LEE Quebec 26 LUCCHESSI Quebec 3 (11 row(s) affected)
- If you do not have an index that exactly match the group by, this will force a physical sort.
- On hundreds of thousands rows, this will take forever."
Only 1 item in the Group By
- SELECT sales_rep, MIN(sales_region) AS 'Region', SUM(sales_num) AS 'Num of Sales'
- FROM sales02
- GROUP BY sales_rep;
sales_rep Region Num of Sales ------------------------- ------------------------- ------------ GOUNOT Manitoba 50 LEE Manitoba 88 LUCCHESSI Manitoba 14 (3 row(s) affected)
- SELECT sales_rep, MAX(sales_region) AS 'Region', SUM(sales_num) AS 'Num of Sales'
- FROM sales02
- GROUP BY sales_rep;
sales_rep Region Num of Sales ------------------------- ------------------------- ------------ GOUNOT Manitoba 50 LEE Manitoba 88 LUCCHESSI Manitoba 14 (3 row(s) affected)
- The results are identical.
- Doing a max(region) or min(region) is not very useful, but if instead of using the region, we replace it with customer IDs/sales IDs..., then it becomes very useful,
- Microsoft SQL Server does not need to a physical sort if there is an index for the group by
- If the table is small enough, MS SQL Server will always do a physical sort, since it's faster than going through the index.
Alternative
- SELECT sales_rep, sales_rep + ' sold:' AS 'Reps', SUM(sales_num) AS 'Num of Sales'
- FROM sales02
- GROUP BY sales_rep;
sales_rep Reps Num of Sales ------------------------- ------------------------------- ------------ GOUNOT GOUNOT sold: 50 LEE LEE sold: 88 LUCCHESSI LUCCHESSI sold: 14 (3 row(s) affected)

