Todo List

Core:
 - Finish table and shorthand CSS attributes
    - border-collapse, caption-side, empty-cells, table-layout, vertical-align
    - background (and friends)
    - list-style
 - Implement all non-essential attribute transforms
 - Microsoft Word HTML cleaning
 - Plugins for major CMSes
 - Rewrite *Definition and Config relationship, add various "levels" of
   cleaning
 - Support other character encodings out-of-the-box
 - Allow strict HTML 4.01, loose HTML 4.01 and strict XHTML 1.0 output

Code issues:
 - Massive profiling, make it faster!
 - Make URI validation routines tighter (especially mailto)
 - Distinguish between different types of URIs, for instance, a mailto URI
   in IMG SRC is nonsensical
 - Rewrite table's child definition to be faster, smart, and regexp free
 - Silently drop content in between SCRIPT tags (can be generalized to allow
   specification of elements that, when detected as foreign, trigger removal
   of children, although unbalanced tags could wreak havoc, or at least
   delete the rest of the document)

Enhancements:
 - Fixes for Firefox's inability to handle COL alignment props (Bug 915)
 - Pretty-printing HTML
 - Hooks for adding custom processors to custom namespaced tags and
   attributes, offer default implementation
 - Auto-paragraphing (be sure to leverage fact that we know when things
   shouldn't be paragraphed, such as lists and tables)
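
Sketch for the SCRIPT item under Code issues: a rough illustration of the generalized idea, where a configurable set of elements is dropped together with everything nested inside it. This is not the library's implementation; it is written in Python on top of the standard html.parser module, and the names ForeignContentStripper, strip_foreign and drop_with_children are hypothetical. It also shows the risk the item mentions: an unclosed tag swallows the rest of the document.

```python
from html.parser import HTMLParser


class ForeignContentStripper(HTMLParser):
    """Drops the listed elements together with all of their children.

    Hypothetical sketch: every other tag is passed through unchanged,
    and comments are discarded because handle_comment is not overridden.
    """

    def __init__(self, drop_with_children=("script",)):
        super().__init__(convert_charrefs=False)
        self.drop_with_children = set(drop_with_children)
        self.depth = 0   # > 0 while inside an element being dropped
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.drop_with_children:
            self.depth += 1
        elif self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a dropped region.
        if tag not in self.drop_with_children and self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.drop_with_children:
            self.depth = max(0, self.depth - 1)
        elif self.depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth == 0:
            self.out.append(f"&#{name};")


def strip_foreign(html, drop_with_children=("script",)):
    parser = ForeignContentStripper(drop_with_children)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(strip_foreign("<p>a</p><script>alert(1)</script><p>b</p>"))
# An unclosed SCRIPT takes the remainder of the document with it.
print(strip_foreign("<p>a</p><script>oops<p>b</p>"))
```

The depth counter is what makes the unbalanced case destructive: once it is raised, nothing is emitted until a matching end tag arrives, so a real implementation would probably need a recovery rule (for example, a limit on how far the dropped region may extend).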