How to select every 12345th row with MSSQL

  • For demo purpose or testing, you want to select a small subset of the very large dataset.
  • Why 1 row every 12345th row? It's just a number picked from the keyboard.
  • The advantage of that method is that the sampled data is spread across the whole table instead of the just the top 100 rows.

Applies to:

  • Microsoft SQL Server 2005
  • Microsoft SQL Server 2008

Data used

  1. USE sql911;
  2. go
  3. SELECT num FROM nums;
  4. go
num
-----------
1
2
3
4
5
6
7
8
9
10

...

1048572
1048573
1048574
1048575
1048576
1048577

(1048577 row(s) affected)


I have truncated the listing because it's 1 million rows!

Selecting the 12345th row

  1. SELECT * FROM (
  2.         SELECT num, row_number() OVER (ORDER BY num) rownum
  3.         FROM nums
  4. ) abc WHERE rownum % 12345 = 0;
  5. go
Num         rownum
----------- --------------------
12345       12345
24690       24690
37035       37035
49380       49380
61725       61725
74070       74070
86415       86415
98760       98760
111105      111105
123450      123450
135795      135795
148140      148140
160485      160485
172830      172830
185175      185175
197520      197520
209865      209865
222210      222210
234555      234555
246900      246900
259245      259245
271590      271590
283935      283935
296280      296280
308625      308625
320970      320970
333315      333315
345660      345660
358005      358005
370350      370350
382695      382695
395040      395040
407385      407385
419730      419730
432075      432075
444420      444420
456765      456765
469110      469110
481455      481455
493800      493800
506145      506145
518490      518490
530835      530835
543180      543180
555525      555525
567870      567870
580215      580215
592560      592560
604905      604905
617250      617250
629595      629595
641940      641940
654285      654285
666630      666630
678975      678975
691320      691320
703665      703665
716010      716010
728355      728355
740700      740700
753045      753045
765390      765390
777735      777735
790080      790080
802425      802425
814770      814770
827115      827115
839460      839460
851805      851805
864150      864150
876495      876495
888840      888840
901185      901185
913530      913530
925875      925875
938220      938220
950565      950565
962910      962910
975255      975255
987600      987600
999945      999945
1012290     1012290
1024635     1024635
1036980     1036980

(84 row(s) affected)

  1. Assign a row number with the row_number function see: How to number rows without using cursors with MS SQL Server for more detailed explanations.
  2. Use the modulus operator %. The % modulus is the remainder when dividing one from the other: rownum % 12345 = 0. This will be true when rownum = 12345 (12345 x 1) or rownum = 24690 (12345 x 2) or rownum = 37035 (12345 x 3)...
  3. I reduced the output from 1 million rows to 84 rows.
  4. It's very different from doing a select top 100 because the data is spread out across the whole table, while select top 100 will select 100 rows that are data related.
  • This is NOT random sampling because every time you run that query, you will get the same rows.