Hpricot is a very flexible HTML parser, based on Tanaka Akira's HTree
and John Resig's jQuery, but with the scanner recoded in C (using Ragel
to generate it).

WWW: http://code.whytheluckystiff.net/hpricot/
