How to remove multiple blanks with MSSQL
- I have been getting quite a few bad cookies in the logs of the web server.
- The cookie says: 'No cookie set' with the white spaces/blanks and the quotes.
- These cookies came from many different IP addresses.
- I wanted to remove the multiple white spaces/blanks from 'No cookie set' to 'No cookie set'.
- SQL Server has the replace statement, but by itself it only converts: 'No cookie set' to 'No cookie set'. It still has 2 spaces left around the word cookie.
- The secret to removing all the multiple blank spaces is to run replace multiple times, but how many times?
Applies to:
- Microsoft SQL Server 2000
- Microsoft SQL Server 2005
- Microsoft SQL Server 2008
Data used
- USE sql911;
- go
- SELECT TOP 15 CAST(RemoteLogName AS VARCHAR(16)) AS 'From',
- DATETIME,
- CAST(Request AS VARCHAR(64)) AS 'Request',
- StatusCode,BytesSent,
- CAST(Referer AS VARCHAR(96)) AS 'Referer',
- CAST(UserAgent AS VARCHAR(96)) AS 'User agent',
- CAST(Cookie AS VARCHAR(80)) AS 'Cookie'
- FROM logs
- WHERE Cookie IS not null and remotehostname = 'sqlhacks.com';
- go
From DateTime Request StatusCode BytesSent Referer User agent Cookie ---------------- ----------------------- ---------------------------------------------------------------- ----------- ----------- ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------------------------------------------------------------------------- 142.166.3.123 2007-09-02 11:10:48.000 GET /index.php/Optimize/Truncate-Delete HTTP/1.1 200 10248 NULL Jakarta Commons-HttpClient/3.1-beta1 'No cookie set' 59.93.7.244 2007-09-02 11:21:44.000 GET /pub/wsplus/wsplus.css HTTP/1.1 200 3393 http://www.sqlhacks.com/index.php/Summary/Multiple_Group_By Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.4;MEGAU 'No cookie set' 59.93.7.244 2007-09-02 11:21:46.000 GET /pub/skins/fixflow/css/rightbar.css HTTP/1.1 200 787 http://www.sqlhacks.com/index.php/Summary/Multiple_Group_By Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.4;MEGAU 'No cookie set' 142.166.3.123 2007-09-02 12:43:04.000 GET /index.php/Site/AllRecentChanges?action=rss HTTP/1.1 200 4018 NULL Jakarta Commons-HttpClient/3.1-beta1 'No cookie set' 212.242.32.164 2007-09-04 10:29:28.000 GET /index.php/Retrieve/Changed-Row HTTP/1.1 200 13474 "http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=SUNA SUNA:2006-44 SUNA:en&q=checksum+mssql",Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NE 212.242.32.164 2007-09-04 10:29:34.000 GET /pub/wsplus/csshover.htc HTTP/1.1 200 3071 "http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=SUNA SUNA:2006-44 SUNA:en&q=checksum+mssql",Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NE 206.188.0.47 2007-09-04 15:52:28.000 GET /index.php/Site/AllRecentChanges?action=rss HTTP/1.1 200 4115 NULL "Mozilla/5.0 (compatible; Konqueror/3.2.3; FreeBSD 4.10-STABLE; X11; i386; en_US) (KHTML like Gecko)", 72.34.230.108 2007-09-04 15:58:45.000 GET / HTTP/1.0 200 19843 NULL "BlogsNowBot V 3.0 (+http://www.blogsnow.com/)", 206.188.0.12 2007-09-03 01:30:33.000 GET /index.php/Site/AllRecentChanges?action=atom HTTP/1.1 200 51274 NULL "Mozilla/5.0 (compatible; Konqueror/3.2.3; FreeBSD 4.10-STABLE; X11; i386; en_US) (KHTML like Gecko)", 206.188.0.16 2007-09-03 01:30:37.000 GET /index.php/Optimize/Index-Rebuild-Reorganize HTTP/1.1 200 10516 NULL "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML like Gecko) Version/3.0.2 Safari/522.12", 72.34.230.108 2007-09-03 01:39:38.000 GET / HTTP/1.0 200 19673 NULL "BlogsNowBot V 3.0 (+http://www.blogsnow.com/)", 206.188.0.10 2007-09-03 18:28:31.000 GET /index.php/Site/AllRecentChanges?action=rss HTTP/1.1 200 4050 NULL "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML like Gecko) Version/3.0.2 Safari/522.12", 206.188.0.10 2007-09-05 17:21:50.000 GET / HTTP/1.1 200 19996 NULL "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML like Gecko) Version/3.0.2 Safari/522.12", 206.188.0.10 2007-09-05 17:21:54.000 GET /index.php/Site/AllRecentChanges?action=rss HTTP/1.1 200 4144 NULL "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML like Gecko) Version/3.0.2 Safari/522.12", 206.188.0.23 2007-09-05 18:56:03.000 GET /index.php/Main/HomePage HTTP/1.1 200 41288 NULL "Mozilla/5.0 (compatible; Konqueror/3.2.3; FreeBSD 4.10-STABLE; X11; i386; en_US) (KHTML like Gecko)", (15 row(s) affected)
- 15 rows are good enough to illustrate the removing or the white spaces/blanks
Removing the white spaces, blanks from text.
- SELECT TOP 15 CAST(RemoteLogName AS VARCHAR(16)) AS 'From',
- DATETIME,
- CAST(Request AS VARCHAR(64)) AS 'Request',
- StatusCode,BytesSent,
- CAST(Referer AS VARCHAR(96)) AS 'Referer',
- CAST(UserAgent AS VARCHAR(96)) AS 'User agent',
- CAST(Cookie AS VARCHAR(80)) AS 'Cookie original',
- LTRIM(REPLACE(Cookie,' ',' ')) AS 'Cookie wrong spacing',
- LTRIM(REPLACE(REPLACE(REPLACE(Cookie,' ',' `````'),'````` ',''),'`````','')) AS 'Cookie w/o extra space'
- FROM logs
- WHERE Cookie IS not null and remotehostname = 'sqlhacks.com';
- go
From DateTime Request StatusCode BytesSent Referer User agent Cookie original Cookie wrong spacing Cookie w/o extra space ---------------- ----------------------- ---------------------------------------------------------------- ----------- ----------- ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ -------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 142.166.3.123 2007-09-02 11:10:48.000 GET /index.php/Optimize/Truncate-Delete HTTP/1.1 200 10248 NULL Jakarta Commons-HttpClient/3.1-beta1 'No cookie set' 'No cookie set' 'No cookie set' 59.93.7.244 2007-09-02 11:21:44.000 GET /pub/wsplus/wsplus.css HTTP/1.1 200 3393 http://www.sqlhacks.com/index.php/Summary/Multiple_Group_By Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.4;MEGAU 'No cookie set' 'No cookie set' 'No cookie set' 59.93.7.244 2007-09-02 11:21:46.000 GET /pub/skins/fixflow/css/rightbar.css HTTP/1.1 200 787 http://www.sqlhacks.com/index.php/Summary/Multiple_Group_By Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.4;MEGAU 'No cookie set' 'No cookie set' 'No cookie set' 142.166.3.123 2007-09-02 12:43:04.000 GET /index.php/Site/AllRecentChanges?action=rss HTTP/1.1 200 4018 NULL Jakarta Commons-HttpClient/3.1-beta1 'No cookie set' 'No cookie set' 'No cookie set' 212.242.32.164 2007-09-04 10:29:28.000 GET /index.php/Retrieve/Changed-Row HTTP/1.1 200 13474 "http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=SUNA SUNA:2006-44 SUNA:en&q=checksum+mssql",Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NE SUNA:en&q=checksum+mssql",Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; Cybercity/IE; Cybercity/IE), SUNA:en&q=checksum+mssql",Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; Cybercity/IE; Cybercity/IE), 212.242.32.164 2007-09-04 10:29:34.000 GET /pub/wsplus/csshover.htc HTTP/1.1 200 3071 "http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=SUNA SUNA:2006-44 SUNA:en&q=checksum+mssql",Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NE SUNA:en&q=checksum+mssql",Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; Cybercity/IE; Cybercity/IE), SUNA:en&q=checksum+mssql",Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; Cybercity/IE; Cybercity/IE), 206.188.0.47 2007-09-04 15:52:28.000 GET /index.php/Site/AllRecentChanges?action=rss HTTP/1.1 200 4115 NULL "Mozilla/5.0 (compatible; Konqueror/3.2.3; FreeBSD 4.10-STABLE; X11; i386; en_US) (KHTML like Gecko)", like Gecko)", like Gecko)", 72.34.230.108 2007-09-04 15:58:45.000 GET / HTTP/1.0 200 19843 NULL "BlogsNowBot V 3.0 (+http://www.blogsnow.com/)", V 3.0 (+http://www.blogsnow.com/)", V 3.0 (+http://www.blogsnow.com/)", 206.188.0.12 2007-09-03 01:30:33.000 GET /index.php/Site/AllRecentChanges?action=atom HTTP/1.1 200 51274 NULL "Mozilla/5.0 (compatible; Konqueror/3.2.3; FreeBSD 4.10-STABLE; X11; i386; en_US) (KHTML like Gecko)", like Gecko)", like Gecko)", 206.188.0.16 2007-09-03 01:30:37.000 GET /index.php/Optimize/Index-Rebuild-Reorganize HTTP/1.1 200 10516 NULL "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML like Gecko) Version/3.0.2 Safari/522.12", like Gecko) Version/3.0.2 Safari/522.12", like Gecko) Version/3.0.2 Safari/522.12", 72.34.230.108 2007-09-03 01:39:38.000 GET / HTTP/1.0 200 19673 NULL "BlogsNowBot V 3.0 (+http://www.blogsnow.com/)", V 3.0 (+http://www.blogsnow.com/)", V 3.0 (+http://www.blogsnow.com/)", 206.188.0.10 2007-09-03 18:28:31.000 GET /index.php/Site/AllRecentChanges?action=rss HTTP/1.1 200 4050 NULL "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML like Gecko) Version/3.0.2 Safari/522.12", like Gecko) Version/3.0.2 Safari/522.12", like Gecko) Version/3.0.2 Safari/522.12", 206.188.0.10 2007-09-05 17:21:50.000 GET / HTTP/1.1 200 19996 NULL "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML like Gecko) Version/3.0.2 Safari/522.12", like Gecko) Version/3.0.2 Safari/522.12", like Gecko) Version/3.0.2 Safari/522.12", 206.188.0.10 2007-09-05 17:21:54.000 GET /index.php/Site/AllRecentChanges?action=rss HTTP/1.1 200 4144 NULL "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML like Gecko) Version/3.0.2 Safari/522.12", like Gecko) Version/3.0.2 Safari/522.12", like Gecko) Version/3.0.2 Safari/522.12", 206.188.0.23 2007-09-05 18:56:03.000 GET /index.php/Main/HomePage HTTP/1.1 200 41288 NULL "Mozilla/5.0 (compatible; Konqueror/3.2.3; FreeBSD 4.10-STABLE; X11; i386; en_US) (KHTML like Gecko)", like Gecko)", like Gecko)", (15 row(s) affected)
- Line 8: ltrim(replace(Cookie,' ',' ')) as you can see, it produces 'No cookie set' with 2 spaces around the word cookie. We need the equivalent of a recursion. Replace does replace all the occurrences of 2 spaces with 1 space, but what about when you have 2 consecutive occurrences of 2 spaces [4 spaces] then it becomes 2 spaces. That's why just a single replace statement does not work.
- Line 9: ltrim(replace(replace(replace(Cookie,' ',' `````'),'````` ',),'`````',)) does produce the correct result of 'No cookie set' with only 1 space around the word cookie.
- Why use the back-tick ` 5 times?: Because it's a combination that does not exist in my data. It could anything else as long as it's not a T-SQL reserved character [wildcard...]. The 5 back-ticks are just a place holder.
- Why the 3 replaces?: Each one is a variation that replaces the 2 spaces. I introduced the back-tick to keep the spot for a space, so the combination is 2 of them, or 1 space in front or 1 space after.
- Why the ltrim?: In case that the spaces are at the beginning.

