Changing quazi-XML into different output

Category: sql server xml

Question

Naomi N on Tue, 05 Mar 2013 17:56:40


Hi everybody,

I am wondering if C# or SQL Server may be better to convert XML like text into a different way.

Here are two row samples we have in our table in the Text type of the column:

<CurrentState>
<guests><guest_no>3004001</guest_no><parent_no>34000001</parent_no><addit_no>0</addit_no><addit_no2>0</addit_no2><trans_no>0</trans_no><guestgroup></guestgroup><salute></salute><first_name>LAURA</first_name><last_name>STONE</last_name><mid_name></mid_name><suffix></suffix><e_mail></e_mail><birth_date></birth_date><gender></gender><mug_shot></mug_shot><mug_date></mug_date><notes></notes><e_message></e_message><check_bx1>false</check_bx1><check_bx2>false</check_bx2><check_bx3>false</check_bx3><check_bx4>false</check_bx4><check_bx5>false</check_bx5><check_bx6>false</check_bx6><check_bx7>false</check_bx7><check_bx8>false</check_bx8><check_bx9>false</check_bx9><check_bx10>false</check_bx10><check_bx11>false</check_bx11><check_bx12>false</check_bx12><check_bx13>false</check_bx13><check_bx14>false</check_bx14><check_bx15>false</check_bx15><number_1>0</number_1><number_2>0</number_2><number_3>0</number_3><number_4>0.00</number_4><number_5>0.00</number_5><text_1></text_1><text_2></text_2><text_3></text_3><text_4></text_4><text_5></text_5><text_6></text_6><text_7></text_7><text_8></text_8><memo_1></memo_1><memo_2></memo_2><memo_3></memo_3><date_1></date_1><datetime_1></datetime_1><web_user></web_user><web_pswd></web_pswd><importid></importid><import2nd></import2nd><importir></importir><importeml></importeml><guest_2nd>0</guest_2nd><salute2></salute2><firstname2></firstname2><lastname2></lastname2><midname2></midname2><suffix2></suffix2><salespoint>TICKET</salespoint><operator>ADMIN</operator><date_time>04/11/2008 10:03:36 AM</date_time><last_mod>0</last_mod><gfwdstatus>0</gfwdstatus><guest_id>0</guest_id><relation>0</relation><role_no>0</role_no><acct_name></acct_name><vipcode>0</vipcode><height>0</height><weight>0</weight><height_m>0</height_m><weight_m>0</weight_m><no_mail>false</no_mail><no_email>false</no_email><no_phone>false</no_phone><addr_pref>0</addr_pref></guests></CurrentState>

-------------------
--Record N0
details:<LiabilityText>Standard Season Pass Liability Text goes here...</LiabilityText>

We need to change the above into something like this (I used a different ID that's why output doesn't match):

Changed:       

CurrentState:  
 guests:        
  guest_no:      5000001
  parent_no:     5000001
  first_name:    MECHELE
  last_name:     LRE
  birth_date:    01/20/1966 12:00:00 AM
  importid:      SIMP5000001
  import2nd:     S2ND5000001
  importir:      SIR5000001
  importeml:     SEML5000001
  firstname2:    JIM
  lastname2:     LRE
  salespoint:    LISA
  operator:      ADMIN
  date_time:     03/17/2008 03:59:39 PM
  last_mod:      6
  gfwdstatus:    1

So, the idea is to remove all empty values and output the rest using the above format and the number of spaces before row equal the tag nesting level. I have VFP function that performs the above display and my challenge is to either convert it to C# code or attempt to get that string formatted properly from the SQL Server directly. There is also a possibility of having bad XML data (tags not closed / matched) in some rows.

What do you think?

Thanks in advance.


For every expert, there is an equal and opposite expert. - Becker's Law


My blog


Replies

Stefan Hoffmann on Tue, 05 Mar 2013 20:22:11


While removing empty tags is not a problem as also are the spaces, I think the challenge are your incomplete tags. I'm not sure right now without digging deeper into it, but I think there are two solutions:

  1. Use a SGML parser, it should be able to handle this.
  2. Use a RegEx clean up run before parsing it as XML.

Naomi N on Wed, 06 Mar 2013 05:41:47


After struggling with it for almost the whole day, I got the following code

http://social.msdn.microsoft.com/Forums/en-US/xmlandnetfx/thread/a1d05853-53da-43da-80ff-4b64a3c6919a

SathyanarrayananS on Wed, 06 Mar 2013 12:54:44


Hi Naomi,

Try this

DECLARE    @x1 XML = '
<CurrentState>
<guests><guest_no>5000001</guest_no><parent_no>5000001</parent_no>
<addit_no>0</addit_no><addit_no2>0</addit_no2><trans_no>0</trans_no>
<guestgroup></guestgroup><salute></salute><first_name>MEC</first_name>
<last_name>LAR</last_name><mid_name></mid_name><suffix></suffix><e_mail></e_mail>
<birth_date>01/20/1966 12:00:00 AM</birth_date><gender></gender>
<mug_shot></mug_shot><notes></notes><e_message></e_message>
<number_2>0</number_2>
<number_3>0</number_3><number_4>.00</number_4><number_5>.00</number_5>
<text_1></text_1><text_2></text_2><text_3></text_3><text_4></text_4><text_5></text_5><text_6>
</text_6><text_7></text_7><text_8></text_8><memo_1></memo_1><memo_2></memo_2><memo_3></memo_3>
<web_user></web_user><web_pswd></web_pswd><importid>SIMP5000001</importid>
<import2nd>S2ND5000001</import2nd><importir>SIR5000001</importir>
<importeml>SEML5000001</importeml><guest_2nd>0</guest_2nd><salute2></salute2>
<firstname2>JIM</firstname2><lastname2>LAR</lastname2>
<midname2></midname2><suffix2></suffix2>
<salespoint>LISA</salespoint><operator>ADMIN</operator>
<date_time>03/17/2008 03:59:39 PM</date_time>
<last_mod>6</last_mod><gfwdstatus>1</gfwdstatus>
<guest_id>0</guest_id><relation>0</relation>
<role_no>0</role_no><acct_name></acct_name>
<vipcode>0</vipcode><height>0</height>
<weight>0</weight><height_m>0</height_m>
<weight_m>0</weight_m><no_mail>false</no_mail><no_email>false</no_email>
<no_phone>false</no_phone><addr_pref>0</addr_pref></guests></CurrentState>";
'
DECLARE 
    @Count INT, 
    @totalCount INT,
    @ElementName VARCHAR(30),
    @ElementValue VARCHAR(30)
-- counter variables
SELECT 
    @Count = 1,
    @totalCount = @x1.value('count(/CurrentState/guests/*)','INT')
    
DECLARE @TMP TABLE(Name VARCHAR(1000),Value VARCHAR(1000))   
-- loop
WHILE @Count <= @totalCount BEGIN
    SELECT
        @ElementName = @x1.value(
            'local-name((/CurrentState/guests/*[position()=sql:variable("@Count")])[1])',
            'VARCHAR(30)'),
        @ElementValue = @x1.value(
            '(/CurrentState/guests/*[position()=sql:variable("@Count")])[1]',
            'VARCHAR(30)')
    
    INSERT @TMP
    SELECT @ElementName,@ElementValue
    --PRINT @ElementName + ':' + @ElementValue
    --PRINT ''
    
    -- increment the counter variable
    SELECT @Count = @Count + 1
END
SELECT * FROM @TMP WHERE Value <> ''

The below link is for just removing XML tags
http://sathyadb.blogspot.in/2012/10/sql-server-remove-xml-tags-from-xml.html

Thanks & Regards, sathya


Naomi N on Wed, 06 Mar 2013 13:03:22


Thanks, but this is not going to work for the generic case I am after. I can not predict the way the details column is saved and what information is in there. In my first message I showed samples of two rows which used completely different content. Although most likely for the method I am working on the content of the row will be like the first row showed.