HTML+CSS Validating Parser

Summary: Parses awful HTML and CSS
Author: Todd Blanchard
Owner: Todd Blanchard (tb)
Co-maintainers: <None>
Categories:
Homepage: http://www.squeaksource.com/htmlcssparser
PackageInfo name: HTML-CSS

Description:

This is an HTML and CSS parser and DOM that handles rotten HTML and broken CSS quite well. I wrote it to validate web pages, and it is the underlying technology behind http://www.badpage.info. The tag nesting and attribute rules are determined by interpreting the W3C DTDs, which I hope will make it fairly future-proof. The CSS parser understands most of CSS 2 and some of CSS 3, and a CSS selector can tell whether it matches a given DOM node. There is no visual rendering and no layout calculation.
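
To illustrate what "a selector can tell whether it matches a DOM node" means, here is a minimal sketch. It is written in Python, not Smalltalk, and it does not use this package's classes or reflect its implementation: `Node`, `matches_simple`, and `matches` are hypothetical names invented for the example. It assumes only tag, `#id`, and `.class` selectors joined by the descendant combinator (whitespace).

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical DOM element: a tag, its attributes, and a parent link."""
    tag: str
    attrs: dict = field(default_factory=dict)
    parent: Optional["Node"] = None


def matches_simple(node: Node, simple: str) -> bool:
    """Match one compound selector such as 'div', '#main', or 'p.note'."""
    m = re.fullmatch(r"([a-zA-Z][\w-]*|\*)?((?:[#.][\w-]+)*)", simple)
    if m is None:
        raise ValueError(f"unsupported selector: {simple!r}")
    tag, rest = m.group(1), m.group(2)
    # HTML tag names compare case-insensitively; '*' matches any tag.
    if tag and tag != "*" and tag.lower() != node.tag.lower():
        return False
    classes = node.attrs.get("class", "").split()
    for kind, name in re.findall(r"([#.])([\w-]+)", rest):
        if kind == "#" and node.attrs.get("id") != name:
            return False
        if kind == "." and name not in classes:
            return False
    return True


def matches(node: Node, selector: str) -> bool:
    """Match a descendant selector against node, working right to left."""
    parts = selector.split()
    if not parts or not matches_simple(node, parts[-1]):
        return False
    ancestor = node.parent
    for part in reversed(parts[:-1]):
        # Walk up until some ancestor satisfies this part of the selector.
        while ancestor is not None and not matches_simple(ancestor, part):
            ancestor = ancestor.parent
        if ancestor is None:
            return False
        ancestor = ancestor.parent
    return True


# Usage: build body > div#main > p.note.big and test a selector.
body = Node("body")
div = Node("div", {"id": "main"}, body)
p = Node("p", {"class": "note big"}, div)
print(matches(p, "div#main p.note"))
```

Matching runs from the rightmost part of the selector upward through the ancestors, so the question asked is always "does this node match?" rather than "which nodes match?".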

I hereby license it free for almost any use, with the understanding that it may not be used to provide website QA software or services such as might compete with http://badpage.info.

Otherwise, do whatever you like with it. I think it would make a dandy base for a real web browser. I also find it quite useful for scraping web pages.


