
From the README page:

  hxpipe (1) - convert XML to a format easier to parse with Perl or AWK
I'm unfamiliar with both Perl and AWK. Could anyone point me to an explanation or example of why this is easier to parse, and of what format it generates? Would it be easy to write a similar utility to, say, convert XML to a Lua table?



The idea is that those utilities work in the UNIX way, which means that they are line-oriented.

The following two XML documents are equivalent:

    <a><b><c /></b><d>foo</d></a>
and

    <a>
      <b> <c /> </b>
      <d>foo</d>
    </a>
But classical UNIX tools are line-oriented, so it is quite difficult for them to see that equivalence, and you'll have a hard time doing operations such as "replace 'foo' with 'bar' if it appears as the text node of a 'd' tag".

So the idea of hxpipe is to give you a line-oriented representation in which those two documents look alike.

But it actually fails to do that properly (at least to my taste). I much prefer the output of xml2. Compare:

    # first doc, output of hxpipe
    (a
    (b
    |c
    )b
    (d
    -foo
    )d
    )a
    -\n

    # second doc, output of hxpipe
    (a
    -\n  
    (b
    - 
    |c
    - 
    )b
    -\n  
    (d
    -foo
    )d
    -\n
    )a
    -\n

    # output of xml2, for both documents
    /a/b/c
    /a/d=foo

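To answer the "would it be easy to write a similar utility" part: here is a rough Python sketch that produces path lines in the style of the xml2 output shown above. It is not how xml2 itself is implemented, and it only imitates the two lines in this comment. The names `flatten`, `to_lines` and `replace_text` are made up for illustration, and the sketch assumes that whitespace-only text can be dropped and that attributes and text following a child element can be ignored.

```python
import xml.etree.ElementTree as ET

def flatten(elem, prefix=""):
    """Yield xml2-style path lines for elem and its descendants.

    Assumptions of this sketch: whitespace-only text is dropped,
    attributes and tail text are ignored.
    """
    path = f"{prefix}/{elem.tag}"
    text = (elem.text or "").strip()
    children = list(elem)
    if text:
        # Element with real text content: path=text
        yield f"{path}={text}"
    elif not children:
        # Empty leaf element: just the path
        yield path
    for child in children:
        yield from flatten(child, path)

def to_lines(xml_source):
    """Parse an XML string and return its path lines as a list."""
    return list(flatten(ET.fromstring(xml_source)))

def replace_text(lines, path, old, new):
    """Line-oriented edit: swap old for new only at the given path."""
    target = f"{path}={old}"
    return [f"{path}={new}" if line == target else line for line in lines]

doc1 = "<a><b><c /></b><d>foo</d></a>"
doc2 = """<a>
  <b> <c /> </b>
  <d>foo</d>
</a>"""

for line in to_lines(doc1):
    print(line)
```

With one fact per line, the "replace 'foo' with 'bar' inside a 'd' tag" operation becomes a plain string comparison per line, which is what `replace_text(to_lines(doc2), "/a/d", "foo", "bar")` does. Emitting a Lua table instead would mean walking the same tree and printing nested table syntax rather than paths.
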

Many thanks for the detailed reply. That makes a lot of sense.


Thanks! I've always created hokey awk scripts that split on <, but I really like that xml2 output!



