Wednesday, January 1, 2014

XML vs HL7

It is FAQ Wednesday, when I take a FAQ off the pile and address it.

Today's frequently asked question is "why do so many systems use HL7 instead of XML?"

This is a good question with many possible answers, but this is my executive summary: XML is easy for humans to read and HL7 is easy for computers to process.

Medical IT is often short on computing power and long on functionality, so it is natural to avoid formats that are expensive to process and to embrace ones that are easy to process, even at the cost of human legibility. In my experience, the people who wonder at the lack of XML are not career medical IT professionals.

XML is a markup language, a structured tagged text format which descends directly from SGML. It was intended as a platform-independent document storage format, but has become a kind of universal data exchange format.

HL7 is a line-oriented, record- and field-based text format which is rather reminiscent of the serial line-oriented message formats of yore, such as ASTM, which was already familiar to clinical lab people from instrument interfaces.

XML makes more-or-less self-documenting "trees" which can be displayed natively by most browsers, or "visualized" with a little JavaScript magic (see http://www.w3schools.com/xml/xml_to_html.asp). There are lots of tools for working with XML and storing it.
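As a rough illustration of the "tree" idea, here is a minimal Python sketch that walks a small XML document and prints its structure as an indented outline. The document itself is made up for this example: the element names loosely echo the lab-results vocabulary but "result", its "code" attribute, and the date value are assumptions, not taken from any real feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration only.
DOC = """<lab-test-results>
  <when>1998-07-29</when>
  <lab-group>
    <result code="997231">M415</result>
  </lab-group>
</lab-test-results>"""


def outline(element, depth=0):
    """Return the tag names of a tree, indented two spaces per level."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(DOC))))
```

The tags carry the structure, which is why a person can read the document without a field map in hand.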

In theory, XML is fault-intolerant: XML processing is supposed to halt at the first error encountered. This is not very robust, but in theory there should be no errors, because you can write a formal document type definition (DTD) which lets people make sure that they are sending and receiving data in exactly the way that you expect. If the XML document is made using a DTD and parsed with the same DTD, what could go wrong? And whoever created the data ran a validator, such as http://www.w3schools.com/xml/xml_validator.asp, on it before releasing it, right?

(In practice, I do not see very much strict adherence to document type definitions.)
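The halt-at-the-first-error behavior is easy to see with Python's standard-library parser. This is only a sketch: it checks well-formedness, not conformance to a DTD (the standard-library ElementTree module does not do DTD validation), and the two sample documents are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents, invented for illustration only.
GOOD = "<lab-result><test>008342</test><status>F</status></lab-result>"
BAD = "<lab-result><test>008342</test><status>F</lab-result>"  # <status> is never closed


def try_parse(text):
    """Return the root tag if the document is well-formed, else None."""
    try:
        return ET.fromstring(text).tag
    except ET.ParseError as err:
        # The parser gives up at the first problem; nothing is salvaged.
        print("parse stopped:", err)
        return None


if __name__ == "__main__":
    print(try_parse(GOOD))
    print(try_parse(BAD))
```

One missing close tag and the receiver gets nothing at all from the second document, even though the test code before the error was perfectly readable.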


HL7 makes nice, simple messages which can be easily processed by almost any programming language. I have written HL7 message processors in C, Perl, PHP, and BASIC.
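To give a feel for how little code that takes, here is a minimal Python sketch of an HL7 v2 splitter. It assumes the usual delimiters ("|" between fields, "^" between components) are hard-coded, whereas a real processor would read them from the MSH segment, and it ignores repetitions, subcomponents, and escape sequences. The function name and the three-segment sample are made up for this example.

```python
# Hypothetical three-segment message, joined with carriage returns,
# which is the usual HL7 v2 segment terminator.
SAMPLE = "\r".join([
    r"MSH|^~\&|LCS|LCA|LIS|TEST9999|199807311532||ORU^R01|3629|P|2.2",
    "OBX|2|CE|997231^RESULT 1^L||M415|||||N|F|||19980729160500|BN",
    "NTE|1|L|MORAXELLA (BRANHAMELLA) CATARRHALIS",
])


def parse_hl7(message):
    """Split a message into segments, each with a list of fields,
    each field being a list of components."""
    segments = []
    # Accept bare newlines as well as carriage returns between segments.
    for line in message.replace("\r\n", "\n").replace("\r", "\n").split("\n"):
        if not line.strip():
            continue
        raw = line.split("|")
        fields = []
        for index, value in enumerate(raw[1:]):
            if raw[0] == "MSH" and index == 0:
                # The first MSH field holds the encoding characters
                # themselves, so it must not be split on "^".
                fields.append([value])
            else:
                fields.append(value.split("^"))
        segments.append({"id": raw[0], "fields": fields})
    return segments


if __name__ == "__main__":
    for segment in parse_hl7(SAMPLE):
        print(segment["id"], segment["fields"])
```

Everything here is plain string splitting, which is why the same job is just as easy in C, Perl, PHP, or BASIC.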

So what do these formats look like side by side? Consider the following two samples:

HL7 Lab Result:
MSH|^~\&|LCS|LCA|LIS|TEST9999|199807311532||ORU^R01|3629|P|2.2
PID|2|2161348462|20809880170|1614614|20809880170^TESTPAT||19760924|M|||^^^^
00000-0000|||||||86427531^^^03|SSN# HERE
ORC|NW|8642753100012^LIS|20809880170^LCS||||||19980727000000|||HAVILAND
OBR|1|8642753100012^LIS|20809880170^LCS|008342^UPPER RESPIRATORY
CULTURE^L|||19980727175800||||||SS#634748641 CH14885 SRC:THROA
SRC:PENI|19980727000000||||||20809880170||19980730041800||BN|F

OBX|1|ST|008342^UPPER RESPIRATORY CULTURE^L||FINALREPORT|||||N|F||| 19980729160500|BN
ORC|NW|8642753100012^LIS|20809880170^LCS||||||19980727000000|||HAVILAND
OBR|2|8642753100012^LIS|20809880170^LCS|997602^.^L|||19980727175800||||G|||
19980727000000||||||20809880170||19980730041800|||F|997602|||008342

OBX|2|CE|997231^RESULT 1^L||M415|||||N|F|||19980729160500|BN
NTE|1|L|MORAXELLA (BRANHAMELLA) CATARRHALIS
NTE|2|L| HEAVY GROWTH
NTE|3|L| BETA LACTAMASE POSITIVE
OBX|3|CE|997232^RESULT 2^L||MR105|||||N|F|||19980729160500|BN
NTE|1|L|ROUTINE RESPIRATORY FLORA


(from http://www.corepointhealth.com/resource-center/hl7-resources/hl7-oru-message)

XML Lab Result (schema excerpt):
<element name="lab-test-results">
        <complexType>
            <annotation>
                <documentation>
                    <summary>
                        A series of lab test results.
                    </summary>
                </documentation>
            </annotation>
            <sequence>
                <element name="when" type="d:approx-date-time" minOccurs="0">
                    <annotation>
                        <documentation>
                            <summary>
                                The date and time of the results.
                            </summary>
                        </documentation>
                    </annotation>
                </element>
                <element name="lab-group" type="lab:lab-test-results-group-type" maxOccurs="unbounded">
                    <annotation>
                        <documentation>
                            <summary>
                                    A set of lab results.
                            </summary>
                        </documentation>
                    </annotation>
                </element>
                <element name="ordered-by" type="t:Organization" minOccurs="0">
                    <annotation>
                        <documentation>
                            <summary>
                                    The person or organization that ordered the lab tests.
                            </summary>
                        </documentation>
                    </annotation>
                </element>
            </sequence>
        </complexType>
    </element>

 (from http://social.msdn.microsoft.com/Forums/en-US/5003cf00-de7f-41ec-93a9-c04b14e41837/xml-schema-of-lab-test-results)
