ClickAider

libxml_helper: a more delightful approach to parsing XML in Ruby

Libxml-Ruby is a blindingly fast way to parse XML in Ruby. However, many have complained that the interface is verbose and not especially Rubyish, and that the documentation lacks details and examples.

Thanks to the flexibility of Ruby, it’s easy to remedy this situation.

libxml_helper.rb, described here, opens up the XML::Node class from Libxml and adds helper methods that make it easier to use. The helpers also make it easier to run XPath queries against nodes that declare a default namespace. (I intend to keep adding helpers to smooth rough edges as I encounter them.)

The interface was inspired by the fantastic Hpricot, a “fast and delightful HTML parser”, and many Hpricot examples will work unmodified.

Convenience functions

You can call to_xml_doc on any string to convert it into an XML::Document:

>> s = '<foo><author>p. bogle</author><bar>content</bar><bar>cont2</bar></foo>'
>> root = s.to_xml_doc.root

The at() method returns the first Node matching the given XPath expression:

>> root.at("author")
=> <author>p. bogle</author>

The search() method returns a list of Nodes matching the given XPath expression:

>> root.search("bar")
=> [<bar>content</bar>, <bar>cont2</bar>]

search() can also be called with a block to iterate through each of the matching nodes:

>> root.search("bar") { |bar| puts bar.xpath }
/foo/bar[1]
/foo/bar[2]
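
For readers curious how helpers like these can be built, here is a minimal sketch of the class-reopening technique. It is not the source of libxml_helper.rb. To keep it runnable without a native gem, it uses REXML (bundled with Ruby) as a stand-in for Libxml, so REXML::Element plays the role of XML::Node. The method names mirror the ones shown above, but the bodies are my assumptions.

```ruby
require 'rexml/document'

# Stand-in sketch: REXML replaces Libxml here, so nothing below is the
# actual libxml_helper.rb implementation. Only the method names
# (to_xml_doc, at, search) come from the examples above.

class String
  # Parse this string and return a document object.
  def to_xml_doc
    REXML::Document.new(self)
  end
end

class REXML::Element
  # Return the first node matching the XPath expression, or nil.
  def at(xpath)
    REXML::XPath.first(self, xpath)
  end

  # Return every node matching the XPath expression as an array.
  # If a block is given, also yield each matching node to it.
  def search(xpath, &block)
    results = REXML::XPath.match(self, xpath)
    results.each(&block) if block
    results
  end
end

s = '<foo><author>p. bogle</author><bar>content</bar><bar>cont2</bar></foo>'
root = s.to_xml_doc.root

puts root.at("author").text                 # text of the first <author>
puts root.search("bar").length              # number of <bar> children
root.search("bar") { |bar| puts bar.text }  # block form, one line per node
```

Returning the array from search() even when a block is given keeps the two calling styles interchangeable: the same call can iterate, be chained, or both.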

The helper also improves the handling of default namespaces…

