libxml_helper: a more delightful approach to parsing XML in Ruby
Libxml-Ruby is a blindingly fast way to parse XML in Ruby. However, many have complained that the interface is verbose and not especially Rubyish, and that the documentation lacks details and examples.
Thanks to the flexibility of Ruby, it’s easy to remedy this situation.
libxml_helper.rb, described here, opens up the XML::Node class from Libxml and adds helper functions that make it easier to use. The helpers also make it easier to use xpath with nodes that include default namespaces. (I intend to continue adding to the helpers to smooth rough edges as I encounter them.)
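The pattern itself, reopening an existing class and layering short methods on top of the finder it already has, can be sketched without Libxml at all. In the sketch below, ToyNode is a hypothetical stand-in for XML::Node, and its find method only matches direct children by tag name rather than evaluating real xpath. None of it is taken from the actual libxml_helper.rb source.

```ruby
# ToyNode is a hypothetical stand-in for XML::Node; it is not part of Libxml.
class ToyNode
  attr_reader :name, :text, :children

  def initialize(name, text = nil, children = [])
    @name = name
    @text = text
    @children = children
  end

  # Stand-in for the library's own finder: matches direct children by tag name only.
  def find(tag)
    children.select { |child| child.name == tag }
  end
end

# Reopen the class later, as a helper file would, and add shorter entry points.
class ToyNode
  # All matches, as a plain Array.
  def search(tag)
    find(tag).to_a
  end

  # First match, or nil when nothing matches.
  def at(tag)
    search(tag).first
  end
end

root = ToyNode.new("foo", nil, [
  ToyNode.new("author", "p. bogle"),
  ToyNode.new("bar", "content"),
  ToyNode.new("bar", "cont2")
])

puts root.at("author").text
puts root.search("bar").map(&:text).inspect
```

Because the second class ToyNode block adds to the existing class instead of replacing it, every node created before or after the helper is loaded picks up at and search.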
The interface was inspired by the fantastic HPricot, a “fast and delightful HTML parser”, and many HPricot examples will work unmodified.
Convenience functions
You can call to_libxml_doc on any string to convert it into an XML::Document:
>> s = '<foo><author>p. bogle</author><bar>content</bar><bar>cont2</bar></foo>'
>> root = s.to_libxml_doc.root
The at() method returns the first Node matching the given xpath:
>> root.at("author")
=> <author>p. bogle</author>
The search() method returns a list of Nodes matching the given xpath:
>> root.search("bar")
=> [<bar>content</bar>, <bar>cont2</bar>]
search() can also be called with a block to iterate through each of the matching nodes:
>> root.search("bar") do |bar| puts bar.xpath; end
/foo/bar[1]
/foo/bar[2]
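One way a single method can support both call styles is to check block_given? before yielding. This is a minimal sketch of that idea only: search_paths and BAR_PATHS are hypothetical names, the list of paths is hard-coded rather than computed from a document, and the real helper may be written differently.

```ruby
# Hypothetical stand-in data: the paths a real search would compute from the document.
BAR_PATHS = ["/foo/bar[1]", "/foo/bar[2]"].freeze

# Yields each match when a block is supplied; returns the full list either way.
def search_paths(matches)
  matches.each { |match| yield match } if block_given?
  matches
end

seen = []
result = search_paths(BAR_PATHS) { |path| seen << path }
puts seen.inspect
puts search_paths(BAR_PATHS).length
```

Returning the list even when a block is given keeps the method usable in both forms.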
The helper also improves the handling of default namespaces…
1 Comment so far
Hi,
Can you help me out please? I can’t seem to make this work. I copied and pasted this file into the lib folder of my application, but when I try to execute the following
>> s = '<foo><author>p. bogle</author><bar>content</bar><bar>cont2</bar></foo>'
>> root = s.to_xml_doc.root
I get an error telling me that to_xml_doc is undefined. Where is it defined anyway? I didn’t see it in the libxml_helper code.
I also tried using to_libxml_doc, but I can’t make that work either.
By Edge on 07.09.09 4:59 pm