Development/Libraries/Java

html-parser: A JavaCC grammar for parsing HTML documents

Name: html-parser
Vendor: JPackage Project
Version: 1.02
License: GPL
Release: 2jpp
URL: http://www.quiotix.com/downloads/html-parser/
Summary
html-parser is a JavaCC grammar for parsing HTML documents. It does not enforce the DTD; instead, it builds a simple parse tree that can be used to validate, reformat, display, analyze, or edit the HTML document. The goal was a parse tree that discards very little of the information in the source file, so that dumping the tree yields an almost identical copy of the input document. The only source information the parser discards is whitespace inside tags (i.e., the spaces or newlines between the attributes of a tag). It is not confused by text that looks like a tag inside a quoted string. The generated parse tree supports the commonly used "Visitor" design pattern, and several visitor classes are provided that, for example, dump or restructure the parse tree. Common tasks such as formatting, validation, or analysis are easily performed as Visitors.
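
The Visitor approach described above can be illustrated with a small, self-contained sketch. It does not use the html-parser API: every class and method name below (VisitorSketch, Node, Tag, EndTag, Text, Dumper, TagCounter, sample) is hypothetical and chosen only for illustration, and the tree is built by hand instead of by a parser. It assumes a flat sequence of nodes and shows one visitor that dumps the tree back to text and another that analyzes it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Visitor pattern over a tiny HTML parse tree.
// None of these types come from html-parser itself.
public class VisitorSketch {

    /** One callback per node type. */
    interface Visitor {
        void visitTag(Tag tag);
        void visitEndTag(EndTag tag);
        void visitText(Text text);
    }

    /** Every node dispatches to the matching visitor method. */
    interface Node {
        void accept(Visitor v);
    }

    static final class Tag implements Node {
        final String name;
        // Attribute order is kept; whitespace between attributes is not.
        final Map<String, String> attributes = new LinkedHashMap<>();
        Tag(String name) { this.name = name; }
        Tag attr(String key, String value) { attributes.put(key, value); return this; }
        public void accept(Visitor v) { v.visitTag(this); }
    }

    static final class EndTag implements Node {
        final String name;
        EndTag(String name) { this.name = name; }
        public void accept(Visitor v) { v.visitEndTag(this); }
    }

    static final class Text implements Node {
        final String text;
        Text(String text) { this.text = text; }
        public void accept(Visitor v) { v.visitText(this); }
    }

    /** Visitor that writes the tree back out as HTML text. */
    static final class Dumper implements Visitor {
        private final StringBuilder out = new StringBuilder();
        public void visitTag(Tag tag) {
            out.append('<').append(tag.name);
            for (Map.Entry<String, String> e : tag.attributes.entrySet()) {
                out.append(' ').append(e.getKey())
                   .append("=\"").append(e.getValue()).append('"');
            }
            out.append('>');
        }
        public void visitEndTag(EndTag tag) { out.append("</").append(tag.name).append('>'); }
        public void visitText(Text text) { out.append(text.text); }
        String result() { return out.toString(); }
    }

    /** Visitor that counts start tags, as a stand-in for an analysis pass. */
    static final class TagCounter implements Visitor {
        int count;
        public void visitTag(Tag tag) { count++; }
        public void visitEndTag(EndTag tag) { }
        public void visitText(Text text) { }
    }

    static String dump(List<Node> nodes) {
        Dumper dumper = new Dumper();
        for (Node n : nodes) n.accept(dumper);
        return dumper.result();
    }

    static int countTags(List<Node> nodes) {
        TagCounter counter = new TagCounter();
        for (Node n : nodes) n.accept(counter);
        return counter.count;
    }

    /** Hand-built tree; the attribute value contains a '>' inside quotes. */
    static List<Node> sample() {
        List<Node> doc = new ArrayList<>();
        doc.add(new Tag("a").attr("href", "x > y"));
        doc.add(new Text("link"));
        doc.add(new EndTag("a"));
        return doc;
    }

    public static void main(String[] args) {
        List<Node> doc = sample();
        System.out.println(dump(doc));
        System.out.println(countTags(doc));
    }
}
```

Because each task lives in its own visitor, a new operation such as validation can be added without changing the node classes; this is the property the summary relies on when it describes formatting, validation, and analysis as Visitors.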

Arch: noarch

Download: html-parser-1.02-2jpp.noarch.rpm
Build Date: Wed Aug 25 01:12:39 2004
Packager: Ralph Apel <r.apel@r-apel.de>
Size: 62 KiB

Arch: noarch

Download: html-parser-1.02-2jpp.src.rpm
Build Date: Wed Aug 25 01:12:39 2004
Packager: Ralph Apel <r.apel@r-apel.de>
Size: 43 KiB

Changelog

* Wed Aug 25 12:00:00 2004 Ralph Apel <r.apel at r-apel.de> 1.02-2jpp
- Build with ant-1.6.2
* Fri Mar 28 11:00:00 2003 Nicolas Mailhot <Nicolas.Mailhot (at) JPackage.org> 1.02-1jpp
- Initial build.
