mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2024-11-08 14:58:42 +00:00
58f00105c8
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1134 48356398-32a2-884e-a903-53898d9a118a
Web Hypertext Application Technology Working Group
WHATWG

== HTML 5 ==

URL: http://www.whatwg.org/specs/web-apps/current-work/

HTML 5 defines a kaboodle of new elements and attributes, as well as
some well-defined, "quirks mode" HTML parsing. Although the WHATWG
professes to target web applications, many of their semantic additions
would be quite useful in regular documents. Eventually, HTML Purifier
will need to audit their lists and figure out what changes need to be
made. This process is complicated by the fact that the WHATWG doesn't
buy into W3C's modularization of XHTML 1.1: we may need to remodularize
HTML 5 (probably by section name). No sense in committing ourselves
until the spec stabilizes, though.

Of more immediate interest, however, is the well-defined parsing
behavior that HTML 5 adds. While I have little interest in writing
another DirectLex parser, other parsers like ph5p
<http://jero.net/lab/ph5p/> can be adapted to DOMLex to support much
more flexible HTML parsing (a cool feature I've seen is how they resolve
misnested tags such as <b>bold<i>both</b>italic</i>).
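
To make the misnesting case concrete, here is a rough sketch of one way
such input could be repaired: when an end tag arrives for an element
that is not the innermost open one, close the elements opened after it,
close the element itself, then reopen the others. This is a simplified
close-and-reopen strategy written for illustration only. It is not the
full algorithm from the HTML 5 spec, and it is not taken from ph5p. The
function name resolve_misnesting is hypothetical, and the tokenizer
only understands attribute-less tags.

```python
import re

# Matches an attribute-less start/end tag, or a run of text.
# Assumption: real input would need a proper tokenizer; a stray "<"
# that does not begin a tag is silently skipped by this pattern.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\s*>|([^<]+)")


def resolve_misnesting(html):
    """Return html with misnested tags closed and reopened in order."""
    out = []    # pieces of the repaired output
    stack = []  # names of currently open elements, outermost first

    for match in TOKEN.finditer(html):
        closing, name, text = match.groups()

        if text is not None:
            out.append(text)
            continue

        name = name.lower()

        if not closing:
            stack.append(name)
            out.append(f"<{name}>")
            continue

        if name not in stack:
            continue  # stray end tag with nothing to close: drop it

        # Find the most recently opened element with this name.
        idx = len(stack) - 1 - stack[::-1].index(name)
        reopened = stack[idx + 1:]

        # Close everything opened after it, innermost first.
        for inner in reversed(reopened):
            out.append(f"</{inner}>")
        out.append(f"</{name}>")
        del stack[idx:]

        # Reopen the elements that were cut short, in original order.
        for inner in reopened:
            stack.append(inner)
            out.append(f"<{inner}>")

    # Close anything still open at the end of input.
    for name in reversed(stack):
        out.append(f"</{name}>")

    return "".join(out)


print(resolve_misnesting("<b>bold<i>both</b>italic</i>"))
```

For the example above this prints <b>bold<i>both</i></b><i>italic</i>,
so "both" stays bold and italic while "italic" is only italic. A real
lexer would also have to deal with attributes, block-level elements and
empty reopened elements, which this sketch ignores.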