Question

srikanth.kasu on Fri, 26 Aug 2016 23:29:28


Hello,

I have a table with one column with mixed data. I would want them to separated into multiple columns. Please help me how to achieve this. Example;

 id       Column1

1        000-00-0000    0.00          John   12/12/2015

2        111-11-1111     1.11         Sam     01/01/2016

result I'm looking for is placing the column1 values into separate columns

id            ssn                interestrate         name             Date

1      000-00-0000            0.00                  John           12/12/2015

2       111-11-1111            1.11                 Sam          01/01/2016

Thanks,


sree


Sponsored



Replies

pituach on Sat, 27 Aug 2016 01:47:12


Hello,

I have a table with one column with mixed data. I would want them to separated into multiple columns. Please help me how to achieve this. Example;

 id       Column1

1        000-00-0000    0.00          John   12/12/2015

2        111-11-1111     1.11         Sam     01/01/2016

result I'm looking for is placing the column1 values into separate columns

id            ssn                interestrate         name             Date

1      000-00-0000            0.00                  John           12/12/2015

2       111-11-1111            1.11                 Sam          01/01/2016

Thanks,


sree

Good day srikanth,

If the data is exactly in the same format then the solution is very simple, but once the format (for example missing part of the partial values or the order is different), then the solution will become much more complex. Let's start with the simple case. Please confirm if this solve your case:

use tempdb Go

------------------------------ Next time please post this part! DDL+DML DROP TABLE IF EXISTS T GO CREATE TABLE T (id INT identity(1,1) primary key , BadStructureAsOneString nvarchar(100)) GO INSERT T (BadStructureAsOneString) values ('000-00-0000 0.00 John 12/12/2015'), ('111-11-1111 1.11 Sam 01/01/2016') GO SELECT * FROM T GO ------------------------- Solution ;With MyCTE as ( SELECT id ,LEFT(BadStructureAsOneString,11) ssn ,[name] = PARSENAME(REPLACE(REPLACE(REPLACE(REPLACE(BadStructureAsOneString,'.','@'),' ','<>'),'><',''),'<>','.'),2) ,interestrate = PARSENAME(REPLACE(REPLACE(REPLACE(REPLACE(BadStructureAsOneString,'.','@'),' ','<>'),'><',''),'<>','.'),3) ,RIGHT(BadStructureAsOneString,10) [Date] FROM T ) select id,ssn,[name], REPLACE(interestrate,'@','.'), [Date] from MyCTE GO


 

signature   Ronen Ariely
 [Personal Site]    [Blog]    [Facebook]    [Linkedin]


pituach on Sat, 27 Aug 2016 01:54:27


Some limitations for the solution above:

1. The use of PARSENAME function fit only for 1-4 items. It fit this question as you have only 4 items in the string. If there is more then we can parse the string in sections.

2. The solution assumes that order of items is the same. If this is not the case then we need different logic. In this case we will need to examine each value that we get from this solution (or any other solution) and check the type.

*** In most cases if the issue is not as simple as here, then it is best to use SQLCLR table function for this issue. 

*** Using SQL Server 2016 we can use JSON by simply adding { at the start and } at the end and replacing the spaces with comma "," instead of dot ".". This way we get simple JSON array, which we can parse very easy.

*** There are infinite options to solve this question! The above is only one of these, here is another one for example:

;With MyCTE as (
	SELECT 
		id 
		,LEFT(BadStructureAsOneString,11) ssn
		,X = RTRIM(LTRIM(SUBSTRING(REPLACE(BadStructureAsOneString,'.','@'),12,LEN(BadStructureAsOneString)-21)))
		,RIGHT(BadStructureAsOneString,10) [Date]
	FROM T
)
select id,ssn, LTRIM(RIGHT(X,LEN(X) - CHARINDEX(' ',X,1))) [name], REPLACE(LEFT(X,CHARINDEX(' ',X,1)),'@','.') interestrate, [Date]
from MyCTE
GO


signature   Ronen Ariely
 [Personal Site]    [Blog]    [Facebook]    [Linkedin]



Sam Zha on Mon, 29 Aug 2016 06:13:24


Hi srikanth.kasu,

You could use script below to get the output as well.

DECLARE @T TABLE (
	id int, 
	Column1 varchar(255)
)

INSERT @T VALUES
(1, '000-00-0000    0.00          John   12/12/2015'),
(2, '111-11-1111     1.11         Sam     01/01/2016')

;WITH CTE AS
(
	SELECT *,  CAST('<x>' + REPLACE(Column1 , ' ', '</x><x>') + '</x>' AS XML) AS Column1_xml
	FROM @T
),
CTE2 AS
(
	SELECT a.*, b.col.value('.', 'varchar(255)') AS [val]
	FROM CTE a
	CROSS APPLY Column1_xml.nodes('/x') b(col)
),
CTE3 AS
(
SELECT id, [val], ROW_NUMBER() OVER(PARTITION BY id ORDER BY (SELECT 0)) AS [row_num]
FROM CTE2
WHERE [val] <> ''
)
SELECT id
	, [1] AS [ssn]
	, [2] AS [interestrate]
	, [3] AS [name]
	, [4] AS [Date]
FROM CTE3
PIVOT (MAX([val]) 
		FOR [row_num] IN ([1], [2], [3], [4])) p

srikanth.kasu on Mon, 29 Aug 2016 18:04:57


I have more than 4 columns, PARSENAME is not the right solution.

srikanth.kasu on Mon, 29 Aug 2016 19:11:50


The column length is dynamic it changes as interestrate value changes, so I cannot use substring. Can you please suggest a better solution which suits my scenario.

srikanth.kasu on Mon, 29 Aug 2016 19:13:38


I think this works fine but it takes forever to run, I have about 300k records and number of records will increase.

pituach on Tue, 30 Aug 2016 03:57:07


Good day srikanth,

In order to cover all the options we need to understand all the options that you have. Since we cannot read minds (at least I cannot), we cannot start a "guessing game", were we waste our time in guessing solution and then you will come with a new information like new format of input.

The ball is in your hand now to re-think about your needs and brings us an exact description of all the optional inputs. In addition please post for each option a specific example, so we will have something to "play with". In short, we need full description of all options and list of examples.

 :-)

pituach on Tue, 30 Aug 2016 11:28:04


Good day srikanth,

Did you read my response above about out guessing game?!?

Anyhow as part of the guessing game please check these simple solutions (using the same DDL+DML that I posted above)

-- For SQL Server 2016 we can use JSON
;with MyCTE as (
	select 
		a = '["'+REPLACE(REPLACE(REPLACE(REPLACE(BadStructureAsOneString,' ','><'),'<>',''),'><',','),',','","')+'"]'
	from T
)
select 
	JSON_VALUE(a, '$[0]') [ssn],
	JSON_VALUE(a, '$[1]') [interestrate],
	JSON_VALUE(a, '$[2]') [name],
	JSON_VALUE(a, '$[3]') [Date]
FROM MyCTE
GO


-- for older server versions
;with MyCTE as (
	select 
		X = convert(XML,'<root>'+ REPLACE(REPLACE('<'+REPLACE(REPLACE(BadStructureAsOneString,' ','/><'),'</>','')+'/>','<','<x>') , '/>','</x>') +'</root>')		
	from T
)
SELECT 
	X.value('(/root/x)[1]', 'varchar(12)' ) [ssn],
	X.value('(/root/x)[2]', 'float' )       [interestrate],
	X.value('(/root/x)[3]', 'varchar(100)' )[name],
	X.value('(/root/x)[4]', 'date' )        [Date]
FROM MyCTE
GO

-- for older server versions if you don't want specific type but flexible
;with MyCTE as (
	select 
		X = convert(XML,'<root>'+ REPLACE(REPLACE('<'+REPLACE(REPLACE(BadStructureAsOneString,' ','/><'),'</>','')+'/>','<','<x>') , '/>','</x>') +'</root>')		
	from T
)
SELECT 
	X.value('(/root/x)[1]', 'varchar(100)') [ssn],
	X.value('(/root/x)[2]', 'varchar(100)')       [interestrate],
	X.value('(/root/x)[3]', 'varchar(100)')[name],
	X.value('(/root/x)[4]', 'varchar(100)')        [Date]
FROM MyCTE
GO

 


signature   Ronen Ariely
 [Personal Site]    [Blog]    [Facebook]    [Linkedin]



scott_morris-ga on Tue, 30 Aug 2016 12:33:10


it takes forever to run, I have about 300k records

Unfortunately bad decisions about the design of the schema result in poor performance trying to generate useful information. And generally speaking, it is unusual to select all rows from a table (unless, of course, your goal is to filter out rows based on the "columns" you generate from your string) - so you need to reconsider your approach and your schema if you want to improve performance.