Thursday, March 22, 2012

Converting a file into multiple tables

This file format has multiple levels (X12).
One level could have one or more instances of the
next level contained within it. Kind of like XML,
except that some sections have no end tags, and the
ones that do have end tags actually have a _different_
tag for the end. (ISA ...IEA or GS ... GE)

It's easy enough to read a line at a time, see what
type it is, and insert its parts into the appropriate
table. Keeping track of the keys of the parent level
for relationships.

But I'm wandering whether there's some (not impossibly
complex) more efficient method with SQL and/or DTS.

--
Wes Groleau

If you put garbage in a computer nothing comes out but garbage.
But this garbage, having passed through a very expensive machine,
is somehow ennobled and none dare criticize it.Take a look at SQLXML Bulk Load
(http://msdn2.microsoft.com/en-us/library/ms171993.aspx).

--
Hope this helps.

Dan Guzman
SQL Server MVP

"Wes Groleau" <groleau+news@.freeshell.orgwrote in message
news:K7x6h.10720$l%2.2462@.trnddc05...

Quote:

Originally Posted by

This file format has multiple levels (X12).
One level could have one or more instances of the
next level contained within it. Kind of like XML,
except that some sections have no end tags, and the
ones that do have end tags actually have a _different_
tag for the end. (ISA ...IEA or GS ... GE)
>
It's easy enough to read a line at a time, see what
type it is, and insert its parts into the appropriate
table. Keeping track of the keys of the parent level
for relationships.
>
But I'm wandering whether there's some (not impossibly
complex) more efficient method with SQL and/or DTS.
>
--
Wes Groleau
>
If you put garbage in a computer nothing comes out but garbage.
But this garbage, having passed through a very expensive machine,
is somehow ennobled and none dare criticize it.

|||Dan Guzman wrote:

Quote:

Originally Posted by

Take a look at SQLXML Bulk Load
(http://msdn2.microsoft.com/en-us/library/ms171993.aspx).


He'd also need an EDI to XML translator. (I recognize those
damnable start/end tags.) Google indicates that several
translators exist; anyone want to offer a recommendation?|||You're right about the EDI to XML translator - I misread Wes's post. of
course, SQLXML can't consume EDI directly.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"Ed Murphy" <emurphy42@.socal.rr.comwrote in message
news:PCH6h.708$Fg.683@.tornado.socal.rr.com...

Quote:

Originally Posted by

Dan Guzman wrote:
>

Quote:

Originally Posted by

>Take a look at SQLXML Bulk Load
>(http://msdn2.microsoft.com/en-us/library/ms171993.aspx).


>
He'd also need an EDI to XML translator. (I recognize those
damnable start/end tags.) Google indicates that several
translators exist; anyone want to offer a recommendation?

|||Dan Guzman wrote:

Quote:

Originally Posted by

You're right about the EDI to XML translator - I misread Wes's post. of
course, SQLXML can't consume EDI directly.


Of course, I could easily write something to
convert it to XML that SQL server can read.
But then I could just as easily convert it
directly into INSERT statements. I'm just
wondering whether DTS or anything else is faster.

I already have a tool that loads the entire
file into an array of lines and provides various
query functions for other apps to access it.

But I'd like to put multiple files in the database
instead of having to select one file at a time.

By the way, whatever the technique is, it could
probably also handle GEDCOM files.

--
Wes Groleau

There ain't no right wing,
there ain't no left wing.
There's only you and me and we just disagree.
(apologies to Jim Krueger)|||Ed Murphy wrote:

Quote:

Originally Posted by

Dan Guzman wrote:

Quote:

Originally Posted by

>Take a look at SQLXML Bulk Load
>(http://msdn2.microsoft.com/en-us/library/ms171993.aspx).


>
He'd also need an EDI to XML translator. (I recognize those
damnable start/end tags.) Google indicates that several
translators exist; anyone want to offer a recommendation?


I think I figured out a solution (haven't tried it yet).

Comments on this idea welcome (I'm kind of new to SQL):

The X12 files and GEDCOM files (maybe HL7, too?) have
multiple levels. Generally, each "level X" record
may own more than one record on level X+1

So if a file has (data elem delims changed to spaces)
....
CLP A B C
SVC X Y Z
SVC 1 2 3
CLP D E F
SVC P Q R
SVC 5 6 7
....
then the first pass through the file could create rows

.... A B C X Y Z ...
.... A B C 1 2 3 ...
.... D E F P Q R ...
.... D E F 5 6 7 ...

Next, one query could SELECT DISTINCT to give

.... A B C
.... D E F

while another could SELECT for

.... A X Y Z ...
.... A 1 2 3 ...
.... D P Q R ...
.... D 5 6 7 ...

and the same strategy could be used on each adjacent pair of levels.

Right ?

--
Wes Groleau

He that is good for making excuses, is seldom good for anything else.
-- Benjamin Franklin|||It's true that you can transform EDI and GEDCOM files directly into
relational format. I think the reason XML is commonly used as an
intermediate format is that XML is perfect for hierarchical data and you can
leverage a high-performance XML import utility like SQLXML without writing
additional code. Although it will take a while, I expect XML will
eventually replace both EDI and GEDCOM formats. You'll be a step ahead if
you can process XML too.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"Wes Groleau" <groleau+news@.freeshell.orgwrote in message
news:uJv7h.211$9e.25@.trnddc02...

Quote:

Originally Posted by

Ed Murphy wrote:

Quote:

Originally Posted by

>Dan Guzman wrote:

Quote:

Originally Posted by

>>Take a look at SQLXML Bulk Load
>>(http://msdn2.microsoft.com/en-us/library/ms171993.aspx).


>>
>He'd also need an EDI to XML translator. (I recognize those
>damnable start/end tags.) Google indicates that several
>translators exist; anyone want to offer a recommendation?


>
I think I figured out a solution (haven't tried it yet).
>
Comments on this idea welcome (I'm kind of new to SQL):
>
The X12 files and GEDCOM files (maybe HL7, too?) have
multiple levels. Generally, each "level X" record
may own more than one record on level X+1
>
So if a file has (data elem delims changed to spaces)
...
CLP A B C
SVC X Y Z
SVC 1 2 3
CLP D E F
SVC P Q R
SVC 5 6 7
...
then the first pass through the file could create rows
>
... A B C X Y Z ...
... A B C 1 2 3 ...
... D E F P Q R ...
... D E F 5 6 7 ...
>
Next, one query could SELECT DISTINCT to give
>
... A B C
... D E F
>
while another could SELECT for
>
... A X Y Z ...
... A 1 2 3 ...
... D P Q R ...
... D 5 6 7 ...
>
and the same strategy could be used on each adjacent pair of levels.
>
Right ?
>
--
Wes Groleau
>
He that is good for making excuses, is seldom good for anything else.
-- Benjamin Franklin

|||Dan Guzman wrote:

Quote:

Originally Posted by

It's true that you can transform EDI and GEDCOM files directly into
relational format. I think the reason XML is commonly used as an
intermediate format is that XML is perfect for hierarchical data and you
can leverage a high-performance XML import utility like SQLXML without
writing additional code. Although it will take a while, I expect XML
will eventually replace both EDI and GEDCOM formats. You'll be a step
ahead if you can process XML too.


OK, I do know how to read and write XML. But can an XML file
be formatted so that the utility will create multiple tables
with the appropriate foreign keys to relate them?

I got the impression when I was reading about it that one
XML file makes one table and vice versa.

--
Wes Groleau
----

"Thinking I'm dumb gives people something to
feel smug about. Why should I disillusion them?"
-- Charles Wallace
(in _A_Wrinkle_In_Time_)|||Wes Groleau wrote:

Quote:

Originally Posted by

OK, I do know how to read and write XML. But can an XML file
be formatted so that the utility will create multiple tables
with the appropriate foreign keys to relate them?
>
I got the impression when I was reading about it that one
XML file makes one table and vice versa.


The impression was wrong. I studied the MS KB article cited
earlier, and I can easily make such XML files. Only, the process
of transforming the file into XML is similar to the process used
by bulk load to turn the XML into records. So I suspect it would
add a little speed if I went directly to records.

--
Wes Groleau

Words of the Wild Wes(t) = http://ideas.lang-learn.us/WWWsqlsql

No comments:

Post a Comment