Fork me on GitHub

Ox

peter@ohler.com
Ohler home

Ox (Optimized XML) is a fast XML parser and Object marshaller as a Ruby gem.

As the name implies was written to provide speed optimized XML and now HTML handling. It was designed to be an alternative to Nokogiri and other Ruby XML parsers in generic XML parsing and as an alternative to Marshal for Object serialization.

Unlike some other Ruby XML parsers, Ox is self contained. Ox uses nothing other than standard C libraries so version issues with libXml are not an issue.

Marshal uses a binary format for serializing Objects. That binary format changes with releases making Marshal dumped Object incompatible between some versions. The use of a binary format make debugging message streams or file contents next to impossible unless the same version of Ruby and only Ruby is used for inspecting the serialize Object. Ox on the other hand uses human readable XML. Ox also includes options that allow strict, tolerant, or a mode that automatically defines missing classes.

It is possible to write an XML serialization gem with Nokogiri or other XML parsers but writing such a package in Ruby results in a module significantly slower than Marshal. This is what triggered the start of Ox development.

Ox handles XML documents in three ways. It is a generic XML parser and writer, a fast Object / XML marshaller, and a stream SAX parser. Ox was written for speed as a replacement for Nokogiri, Ruby LibXML, and for Marshal.

As an XML parser it is 2 or more times faster than Nokogiri and as a generic XML writer it is as much as 20 times faster than Nokogiri. Of course different files may result in slightly different times.

As an Object serializer Ox is up to 6 times faster than the standard Ruby Marshal.dump() and up to 3 times faster than Marshal.load().

The SAX like stream parser is 40 times faster than Nokogiri and more than 13 times faster than LibXML when validating a file with minimal Ruby callbacks. Unlike Nokogiri and LibXML, Ox can be tuned to use only the SAX callbacks that are of interest to the caller. (See the perf_sax.rb file for an example.)

Star Watch Vote on Hacker News Documentation

Please vote

April 24, 2017


Please vote on the desired default for how Ox skips whitespace. Check out issue Issue #171.