Why are my averages wrong with MSSQL?
- SQL ignores NULL values when doing aggregate operations.
- The most affected functions are count, average, min.
Applies to:
- Microsoft SQL Server 2000
- Microsoft SQL Server 2005
- Microsoft SQL Server 2008
Data used
USE sql911;
go
SELECT sales_date, sales_qty, sales_price, emp_short
FROM sales;
go
go
SELECT sales_date, sales_qty, sales_price, emp_short
FROM sales;
go
sales_date sales_qty sales_price emp_short ----------------------- --------------------------------------- --------------------------------------- ---------- 2005-10-06 10:01:01.000 10.00 11.75 000310 2006-10-06 10:01:01.000 10.00 11.75 000310 2005-11-06 10:01:01.000 9.00 11.75 000110 2006-10-06 10:01:01.000 10.00 11.75 000310 2005-12-06 11:01:01.000 8.00 11.75 000010 2006-02-06 13:01:01.000 NULL NULL 000190 2006-03-06 14:01:01.000 5.00 11.75 000110 2006-10-06 10:01:01.000 10.00 11.75 000190 2006-10-06 10:01:01.000 10.00 11.75 000190 2006-04-06 15:01:01.000 4.00 11.75 000030 2006-05-06 16:01:01.000 3.00 11.75 000010 2004-10-06 10:01:01.000 NULL NULL 000310 2005-11-06 10:01:01.000 9.00 11.75 000010 2006-06-06 17:01:01.000 2.00 11.75 000190 2006-01-06 12:01:01.000 7.00 11.75 000310 2006-02-06 13:01:01.000 6.00 11.75 000110 2005-03-06 14:01:01.000 5.00 11.75 000190 2006-10-06 10:01:01.000 10.00 11.75 000110 2006-10-06 10:01:01.000 3.00 11.75 000110 2007-06-06 17:01:01.000 2.00 11.75 000010 2007-04-06 00:00:00.000 4.00 11.75 000010 2007-05-01 00:00:00.000 5.00 11.75 000030 2007-07-06 00:00:00.000 1.00 11.75 000190 2007-05-03 09:00:00.000 3.00 11.75 000010 2007-05-05 04:00:00.000 1.00 11.75 000010 2006-10-06 10:01:01.000 10.00 11.75 000030 2004-03-06 14:01:01.000 NULL NULL 000110 2005-04-06 15:01:01.000 4.00 11.75 000010 2005-05-06 16:01:01.000 3.00 11.75 000190 2006-06-06 17:01:01.000 2.00 11.75 000030 2006-07-06 19:01:01.000 1.00 11.75 000310 2006-03-06 14:01:01.000 5.00 11.75 000310 2006-04-06 15:01:01.000 4.00 11.75 000010 2007-05-07 11:00:00.000 2.00 11.75 000010 2006-06-06 17:01:01.000 NULL NULL 000110 2006-07-06 19:01:01.000 1.00 11.75 000190 2005-10-06 10:01:01.000 10.00 11.75 000030 2007-05-08 15:00:00.000 1.50 11.75 000010 2005-12-06 11:01:01.000 8.00 11.75 000190 2006-01-06 12:01:01.000 7.00 11.75 000030 2007-05-07 11:30:00.000 2.00 10.50 000030 2007-05-08 13:30:00.000 1.00 10.00 000030 2006-07-06 19:01:01.000 1.00 11.75 000010 2007-05-09 14:00:00.000 1.25 10.00 000110 2007-11-06 10:01:01.000 9.00 11.75 000030 2007-05-10 14:00:00.000 5.00 10.00 000190 2007-05-11 14:15:00.000 4.00 10.00 000190 2007-05-12 15:00:00.000 2.00 10.00 000190 2007-05-13 16:00:00.000 1.00 10.00 000190 2007-05-08 05:00:00.000 1.00 10.00 000310 2007-05-07 00:00:00.000 2.00 10.00 000310 2007-05-14 11:00:00.000 3.00 10.00 000310 2006-07-06 00:00:00.000 1.00 11.75 000190 2007-05-11 13:00:00.000 3.50 10.00 000310 2007-05-08 05:00:00.000 1.00 10.00 000010 2007-05-08 07:00:00.000 2.00 10.00 000010 2007-05-08 11:00:00.000 3.00 10.00 000010 (57 row(s) affected)
Please note the following data:
SELECT sales_date, sales_qty, sales_price, emp_short
FROM sales
WHERE sales_qty IS null;
go
FROM sales
WHERE sales_qty IS null;
go
sales_date sales_qty sales_price emp_short ----------------------- --------------------------------------- --------------------------------------- ---------- 2006-02-06 13:01:01.000 NULL NULL 000190 2004-10-06 10:01:01.000 NULL NULL 000310 2004-03-06 14:01:01.000 NULL NULL 000110 2006-06-06 17:01:01.000 NULL NULL 000110 (4 row(s) affected)
Wrong average with MSSQL
SELECT AVG(sales_qty) AS 'Wrong average'
FROM sales;
go
FROM sales;
go
Wrong average --------------------------------------- 4.495283 Warning: Null value is eliminated by an aggregate or other SET operation. (1 row(s) affected)
- Look at the warning message: Null value is eliminated by an aggregate or other SET operation.
- We are talking about a simple arithmetic average defined as the sum of the sales_qty divided by the number of rows.
- The sum of the sales_qty is OK if there is any null value, because in the real world we convert the null to zero.
- The number of rows in the sales_qty without null values from 57 to 53, which raises the average.
SELECT AVG(sales_qty) AS 'Average no null'
FROM sales
WHERE sales_qty IS not null;
go
FROM sales
WHERE sales_qty IS not null;
go
Average no null --------------------------------------- 4.495283 (1 row(s) affected)
Clearer average with MSSQL
SELECT SUM(sales_qty)/COUNT(*) AS 'Average with nulls',
SUM(sales_qty) AS 'Qty sum',
COUNT(*) AS 'Rows Counted',
COUNT(sales_qty) AS 'Quantity counted'
FROM sales;
go
SUM(sales_qty) AS 'Qty sum',
COUNT(*) AS 'Rows Counted',
COUNT(sales_qty) AS 'Quantity counted'
FROM sales;
go
Average with nulls Qty sum Rows Counted Quantity counted --------------------------------------- --------------------------------------- ------------ ---------------- 4.179824 238.25 57 53 Warning: Null value is eliminated by an aggregate or other SET operation. (1 row(s) affected)
SELECT SUM(COALESCE(sales_qty,0))/COUNT(*) AS 'Average with nulls',
SUM(COALESCE(sales_qty,0)) AS 'Qty sum coalesce 0',
SUM(COALESCE(sales_qty,6)) AS 'Qty sum coalesce 6',
COUNT(*) AS 'Counted'
FROM sales;
go
SUM(COALESCE(sales_qty,0)) AS 'Qty sum coalesce 0',
SUM(COALESCE(sales_qty,6)) AS 'Qty sum coalesce 6',
COUNT(*) AS 'Counted'
FROM sales;
go
Average with nulls Qty sum coalesce 0 Qty sum coalesce 6 Counted --------------------------------------- --------------------------------------- --------------------------------------- ----------- 4.179824 238.25 262.25 57 (1 row(s) affected)
- Make sure that you are aware of SQL Server discarding NULL values when doing aggregates.
- Use coalesce to convert the nulls to a value acceptable to you.
- Watch for MIN, AVG and COUNT.

