Abstract: The paper presents an extensible software architecture, a prototype, and an implementation of a highly
configurable system for HTML validation. The system is based on validation rules defined in an XML document called
the “extended validation schema”. This schema complements the official HTML specification,
because differences in how browsers and other web clients render HTML make the specification alone
insufficient: an HTML document can be syntactically valid and yet be rendered poorly in some
browser or mail client. The extended validation schema allows custom, specific validation rules to be defined at
three levels: document rules, element (or tag) rules, and attribute rules. The correctness of the validation
schema is checked via a predefined XSD schema. The paper defines a prototype of a validation engine that
consists of an HTML parser, an HTML validator, a Storage module, and a Statistics module. The HTML parser parses the
HTML file and breaks it into its constituent elements. The HTML validator applies the custom validations defined
in the extended validation schema to every element and attribute, together with the document-level validations,
and automatically corrects errors wherever possible. The Storage module saves the validation results to
persistent storage. The stored results can be regarded as unit tests and are used by the Statistics module for further
statistics, analysis, quality assurance, and bug tracking. The system is compared with other HTML validation
services and solutions. The results of an implementation of the prototype system in a software company are also
presented.
Keywords: HTML validation, XML schema, quality assurance, unit tests, bug tracking.
ACM Classification Keywords: D.4.m Software – Miscellaneous.
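Illustrative sketch: the three rule levels named in the abstract (document, element, and attribute rules) can be pictured with a small Python example. This is only a sketch under stated assumptions, not the paper's implementation: the schema vocabulary (validationSchema, requireElement, forbidElement, requireAttribute), the RuleValidator class, and the validate function are all hypothetical names chosen for illustration. The sketch omits the XSD check of the schema, automatic error correction, storage, and statistics.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical schema: every tag and attribute name here is an assumption
# made for illustration, not the format defined in the paper.
SCHEMA_XML = """
<validationSchema>
  <documentRules>
    <requireElement name="title"/>
  </documentRules>
  <elementRules>
    <forbidElement name="script"/>
  </elementRules>
  <attributeRules>
    <requireAttribute element="img" name="alt"/>
  </attributeRules>
</validationSchema>
"""


class RuleValidator(HTMLParser):
    """Applies document, element, and attribute rules while parsing HTML."""

    def __init__(self, schema_xml):
        super().__init__()
        root = ET.fromstring(schema_xml)
        # Document level: elements that must occur somewhere in the file.
        self.required_elements = {
            e.get("name") for e in root.iterfind("documentRules/requireElement")
        }
        # Element level: tags that must not occur at all.
        self.forbidden = {
            e.get("name") for e in root.iterfind("elementRules/forbidElement")
        }
        # Attribute level: (tag, attribute) pairs that must be present.
        self.required_attrs = [
            (e.get("element"), e.get("name"))
            for e in root.iterfind("attributeRules/requireAttribute")
        ]
        self.seen = set()
        self.errors = []

    def handle_starttag(self, tag, attrs):
        self.seen.add(tag)
        if tag in self.forbidden:
            self.errors.append(f"element <{tag}> is not allowed")
        names = {name for name, _ in attrs}
        for element, attr in self.required_attrs:
            if tag == element and attr not in names:
                self.errors.append(f"<{tag}> is missing attribute '{attr}'")

    def finish(self):
        # Document-level rules can only be checked once parsing is complete.
        for name in sorted(self.required_elements - self.seen):
            self.errors.append(f"document is missing <{name}>")
        return self.errors


def validate(html, schema_xml=SCHEMA_XML):
    """Return the list of rule violations found in the given HTML string."""
    validator = RuleValidator(schema_xml)
    validator.feed(html)
    validator.close()
    return validator.finish()
```

Calling validate on an HTML string returns a list of human-readable violations, and an empty list when every rule in the hypothetical schema is satisfied. Keeping the rules in XML rather than in code mirrors the configurability the abstract describes: a new rule means editing the schema, not the validator.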
Link:
HTML VALIDATION THROUGH EXTENDED VALIDATION SCHEMA
Radoslav Radev
http://www.foibg.com/ijima/vol01/ijima01-3-p09.pdf