Dances with XML

XML. It seems that most developers either love it or hate it. However you feel about it, XML is widespread enough that every Delphi developer is likely to have to reckon with it at some time during their career.

Delphi provides three processor implementations (called DOM vendors) to choose from, all housed under a common API (units xmldom, XMLIntf and XMLDoc). The three stock vendors are:

  • Xerces XML;
  • ADOM XML v4; and
  • MSXML.

Xerces works on Win32 and Linux targets. ADOM is an all-native code solution, so it works on all platforms. MSXML is the default, and naturally it works only on Win32. I don’t know if Xerces or MSXML will work on Win64. Perhaps a reader can advise? For the rest of this entry, I am going to assume that the reader is using (or will be using) IXMLDocument with the MSXML vendor, targeting the Win32 platform.
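
As a minimal sketch of how a vendor can be chosen in code: the xmldom unit has a global DefaultDOMVendor string that names the vendor used for newly created documents. The program below is an illustration only. It assumes the unit names cited above (newer Delphi releases prefix them, for example Xml.XMLDoc and Winapi.ActiveX), that the MSXML vendor has been registered by the stock units on Windows, and that its description string is 'MSXML' as in the list above.

```pascal
program VendorSketch;
{$APPTYPE CONSOLE}

uses
  ActiveX, xmldom, XMLIntf, XMLDoc;

var
  Doc: IXMLDocument;
begin
  CoInitialize(nil);               // MSXML is COM-based, so initialise COM first
  try
    // Assumption: 'MSXML' is the description under which the vendor is registered.
    DefaultDOMVendor := 'MSXML';   // Used by documents created from here on
    Doc := LoadXMLData('<places><sofa/></places>');
    Writeln(Doc.DocumentElement.NodeName);
  finally
    Doc := nil;                    // Release the document before COM shuts down
    CoUninitialize;
  end;
end.
```

A single document can also be given its own vendor through the DOMVendor property of TXMLDocument, which avoids changing the global default.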

Reams of code

Anyone who has used Delphi’s stock library for navigating XML documents will be struck by one disconcerting observation. We Delphi programmers are spoiled by the ease with which we can manipulate strings, records, objects, interfaces and our own cleverly defined library APIs, yet it takes a disproportionately large amount of code to achieve only small things in terms of XML navigation. And if you want robust code that handles missing or unexpected structures in your XML document, OMG, it’s so much code to write! I am not going to provide demonstration code to prove my point – I leave that as an exercise to the reader.

At this point, some readers will be thinking “Use a library with a fluent API!”. Yes, this helps to a degree, but not enough. And to date, there is no publicly available Delphi XML processor library, FOSS or commercial, with a fluent API that can both read and write to the document. With the stock libraries, navigation of XML documents is like wrestling a crocodile. It shouldn’t be like that. It should be like dancing on air. And it can be. The answer is XPATH.

XPATH is a simple language for selecting sequences of nodes within an XML document, relative to a focal point (the context node). XPATH is to XML what SQL is to relational databases. With a little help from a small library unit, we can leverage the power of XPATH within Delphi and make robust document navigation feel like dancing on air. With the MSXML vendor, we are sadly limited to XPATH 1.0; Microsoft offers no XPATH 2.0 implementation. But XPATH 1.0 should be enough for most needs. The remainder of this entry is a series of small problems, each with a demonstration of how simple it is to solve with XPATH.
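
How a helper unit might reach the XPATH engine is not shown in this entry. As a minimal sketch, and only an assumption about one possible approach, the stock xmldom unit declares an IDOMNodeSelect interface with selectNode and selectNodes methods, which the MSXML vendor's nodes support. SelectXPath below is a hypothetical helper written for illustration; it is not part of the library used in the samples that follow.

```pascal
program XPathSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, ActiveX, xmldom, XMLIntf, XMLDoc;

// Hypothetical helper: evaluates an XPath expression relative to Focus
// and returns the matching DOM nodes.
function SelectXPath(const Focus: IXMLNode; const Expr: DOMString): IDOMNodeList;
var
  Sel: IDOMNodeSelect;
begin
  if not Supports(Focus.DOMNode, IDOMNodeSelect, Sel) then
    raise Exception.Create('This DOM vendor does not expose IDOMNodeSelect');
  Result := Sel.selectNodes(Expr);
end;

var
  Doc: IXMLDocument;
  Hits: IDOMNodeList;
  i: Integer;
begin
  CoInitialize(nil);               // MSXML is COM-based
  try
    Doc := LoadXMLData(
      '<places><sofa/><larder><sausage snag-count="1"/></larder></places>');
    // Children of the root element that contain a <sausage> child
    Hits := SelectXPath(Doc.DocumentElement, '*[sausage]');
    if Assigned(Hits) then
      for i := 0 to Hits.length - 1 do
        Writeln(Hits.item[i].nodeName);
  finally
    Hits := nil;                   // Release COM-backed objects before CoUninitialize
    Doc := nil;
    CoUninitialize;
  end;
end.
```

The result here is a raw IDOMNodeList rather than IXMLNode values, which is one reason a small wrapper with an enumerator is more pleasant to use than calling IDOMNodeSelect directly.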

XPath Axes

Use case 1: “Hide the Sausage”

<places>
 <!-- Where is that sausage hiding? -->
 <under-the-bed />
 <in-the-closet>
  <sausage snag-count="3" />
 </in-the-closet>
 <sofa />
 <larder>
  <sausage snag-count="1" />
 </larder>
 <laundry-room />
</places>

In the following sample problems we have declared:

var
  Root: IXMLNode;  // The document node for the above XML document loaded into IXMLDocument.
  Run : IXMLNode;  // Just a loop variable.

Problem 1:

Find all the hiding places for sausages!

Solution 1:

for Run in XFocus(Root) / 'places/*[sausage]' do
  PutNode( Run)
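
For comparison, here is a rough sketch of the same search written against the stock IXMLNode API only, with the kind of defensive checks mentioned earlier. It is an illustration under stated assumptions rather than the definitive stock-library solution: the program name, the SampleXml constant and the ListHidingPlaces procedure are made up for this example, and the unit names are the ones cited at the top of this entry.

```pascal
program StockApiSketch;
{$APPTYPE CONSOLE}

uses
  ActiveX, XMLIntf, XMLDoc;

const
  // The Use Case 1 document, without the comment and whitespace.
  SampleXml =
    '<places>' +
    '<under-the-bed/>' +
    '<in-the-closet><sausage snag-count="3"/></in-the-closet>' +
    '<sofa/>' +
    '<larder><sausage snag-count="1"/></larder>' +
    '<laundry-room/>' +
    '</places>';

// Prints every child of <places> that directly contains a <sausage> element.
procedure ListHidingPlaces(const Doc: IXMLDocument);
var
  Places, Place: IXMLNode;
  i: Integer;
begin
  Places := Doc.DocumentElement;
  if (Places = nil) or (Places.NodeName <> 'places') then
    Exit;                          // Missing or unexpected root element
  for i := 0 to Places.ChildNodes.Count - 1 do
  begin
    Place := Places.ChildNodes[i];
    if Place.NodeType <> ntElement then
      Continue;                    // Skip comments and text nodes
    if Place.ChildNodes.FindNode('sausage') <> nil then
      Writeln(Place.NodeName);
  end;
end;

var
  Doc: IXMLDocument;
begin
  CoInitialize(nil);               // MSXML is COM-based
  try
    Doc := LoadXMLData(SampleXml);
    ListHidingPlaces(Doc);
  finally
    Doc := nil;                    // Release the document before CoUninitialize
    CoUninitialize;
  end;
end.
```

With the sample document this should list in-the-closet and larder. The loop, the node-type test and the nil checks are what the single XPATH expression 'places/*[sausage]' replaces.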

Problem 2:

Is there any place with exactly one sausage?

Solution 2:

if '*/*/sausage[@snag-count=1]' in XFocus(Root) then
  Put(' Yes, there is.')
else
  Put(' No, there isn''t.')

Problem 3:

How many sausages are there in the house? Don’t assume the limited structure in Use Case 1. The document has an arbitrary structure and those sausages can be hiding anywhere at any level in the document.

Solution 3:

Sum := 0;
for Run in XFocus(Root) / '//sausage/@snag-count' do
  Inc( Sum, StrToIntDef( Run.Text, 0))

Note, although valid XPATH, we can’t do the following …

Sum := 0;
for Run in XFocus( Root) / 'sum(//sausage/@snag-count)' do
  Sum := Run.Value

.. because of a limitation imposed by our current vendor (MSXML) that we can only use XPATH that returns node-sets, not atomic values.

Problem 4:

Where are those hiding places again? Produce an HTML list sorted by sausage count ascending.

Solution 4:

Whilst not technically challenging, solving this using IXMLDocument might take a page or two of code. With a little help from a good friend of XPATH, namely XSLT, this looks like a one-liner …

Put( (XFocus(Root) * ppTransform.Content).XML.Text)

…well…, not quite. The ppTransform.Content mentioned above is the string below. It’s an XSLT transform. But all in all, it’s not much code for what it does.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes" />

<xsl:template match="/">
 <ul>
   <xsl:apply-templates select="*/*[sausage]">
     <xsl:sort select="sum( sausage/@snag-count)"  data-type="number" order="ascending" />
   </xsl:apply-templates>
 </ul>
</xsl:template>

<xsl:template match="*">
 <li>
  <xsl:value-of select="name()"/>
 </li>
</xsl:template>
</xsl:stylesheet>

Summary

If from these samples you are not impressed by the power and simplicity of XPATH in Delphi to solve navigational problems, then this is probably a picture of you. (Below left).

 

 

Posted in Language, Misc., Windows
15 comments on “Dances with XML”
  1. jpluimers says:

    This is exactly the reason I am almost done preparing an XML session for the Delphi Tage event in September in Germany (:

  2. jpluimers says:

    There is already a bunch of demo stuff up here: http://bo.codeplex.com/SourceControl/changeset/view/80654#1321776 including an XPath demo that I just retro-ported to Delphi 2007.

  3. Cool picture of the baby.

  4. jpluimers says:

    BTW: Why is the order of parameters for “in” different from “IntDivide” and “Divide”?

    • DaddyHPriest says:

      The order is different to bring closer alignment with natural English. Take a look at Solution 2. Which syntax would read more natural to you: As I have shown in Solution 2? Or the same but with the operators reversed? We are asking if some condition is present in a node. Of course there is plenty of room for divergent opinions here.

  5. Some people, when presented with a problem, think “I know, I’ll use XML.”

    <Problem:Worsening> <Problem:TimeDescription>Now</Problem:TimeDescription> <Problem:Posessive>they have</Problem:Posessive> <Problem:Quantity>many, many</Problem:Quantity> <Problem:WorseningDescription>more problems</Problem:WorseningDescription></ProblemWorsening>
    • turbu says:

      Argh! Everything *almost* posted right, but somehow it ate the linebreaks.
      Trying again:

      <Problem:Worsening>
        <Problem:TimeDescription>Now</Problem:TimeDescription>
        <Problem:Posessive>they have</Problem:Posessive>
        <Problem:Quantity>many, many</Problem:Quantity>
        <Problem:WorseningDescription>more problems</Problem:WorseningDescription>
      </ProblemWorsening>
  6. Iztok Kacin says:

    XML is a pain to use. So much boilerplate code. That is why I wrote SimpleStorage, a semi-fluent but very powerful set of interfaces on top of OmniXML.

    You can find it here:

    http://www.cromis.net/blog/downloads/simplestorage/

    There are also a lot of examples on my blog. It cuts down the code by as much as 80% I would say.

  7. Thanks for this article! I like the idea of the XFocus. However, there seems to be something wrong with how the compiler compiles the Implicit operator from an interface to a record:

    type
      IMyInterface = interface
      end;
    
      TMyObject = class(TInterfacedObject, IMyInterface)
      end;
    
      TRecord = record
      strict private
        I: IMyInterface;
    
      public
        class operator Implicit(const Value: IMyInterface): TRecord;
      end;
    
    class operator TRecord.Implicit(const Value: IMyInterface): TRecord;
    begin
      Result.I := Value;
    end;
    
    type
      TRecTest = class(TTestCase)
      published
        procedure Test;
      end;
    
      { TRecTest }
    
    procedure TRecTest.Test;
    var
      O: TInterfacedObject;
      R: TRecord;
      I: IMyInterface;
    begin
      I := TMyObject.Create; // (I.RefCount: 1)
      O := I as TMyObject; // O.RefCount: 1;
      R := I; // O.RefCount: 3; (!)
    end;

    It took me quite some time to figure out where the reference counting goes wrong. It seems the compiler doesn’t implement the Implicit operator properly when it takes an interface reference and outputs a record containing that interface reference. The reference count rises by 2! When the same functionality is implemented as a method, reference counting is done properly.

    I really don’t know how the lifetime of IXMLNode’s underlying object is managed and how does this affect your very nice code from the article?

    • DaddyHPriest says:

      Martin, This sort of question best belongs on StackOverflow. There is no error in the compiler. The code behaves as designed. It is correct that the reference count reaches 3. Here are the 3 holders: (1) I ; (2) The record; and (3) a temporary implicit pointer living on the call stack. All 3 references go away when the Test() method exits.

  8. jpluimers says:

    @Martin: I think you are bumping into an issue similar to this:

    http://stackoverflow.com/questions/4509015/should-the-compiler-hint-warn-when-passing-object-instances-directly-as-const-in

    Should the compiler hint/warn when passing object instances directly as const interface parameters?

  9. shane van says:

    I just spent a month doing parsing of XML in SQL Server. What really, really impressed me was how absolutely shocking the complexity of XML was in SQL Server. Despite all the hype, the notation was atrocious. Admittedly the XML wasn’t well structured, but since it was out of my control, I had to deal with it. There is absolutely no way I could even get close to the simplicity of some of the stuff shown here. So I will actually consider using XML again in my next project. Still not a fan of XML, but this makes it a bit more usable.

  10. ncook says:

    I love the novel use of the operator overloading in the XFocus record. That’s really thinking outside the box. However I’m not sure what the purpose of the ‘IntDivide’ overload is. Unless I’m missing something, there does not seem to be an example that uses it.
