Handling Content Model Changes


1. Context

The distinction between Transitional and Strict document types is something of an anomaly in the lineage of XHTML document types (after 1.0, doctypes no longer have flavors; instead, modularization is used to let document authors vary the elements available to them). This transition is usually quite straightforward, as W3C usually deprecates attributes or elements, which are easily handled using tag and attribute transforms.

However, for three elements, <blockquote>, <body> and <address>, W3C elected to also change the content model. <blockquote> and <body> originally accepted both inline and block elements, but in the strict doctype they only allow block elements. With <address>, the situation is inverted: <p> tags are now forbidden from appearing within this tag.


2. Current situation

Currently, HTML Purifier treats <blockquote> specially during Tidy mode using a custom ChildDef class, StrictBlockquote. StrictBlockquote operates similarly to Required, except that when it encounters an inline element, it wraps it in a block tag (as specified by %HTML.BlockWrapper; the default is <p>). The naming suggests it can only be used for <blockquote>s, although it may be possible to genericize it to work on other cases of this nature (this would be of little practical application, as no other element in XHTML 1.1 or earlier has a block-only content model).

Tidy currently contains no custom, lenient implementation for <address>. If one were to be written, it would likely operate on the principle that, when a <p> tag is encountered, it is replaced with a leading and trailing <br /> tag (the contents of <p>, being inline, are not an issue). There is no prior work with this sort of operation.


3. Outside applicability

There are a number of other elements that contain restrictive content models, such as <ul> or <span> (the latter is restrictive in that it does not allow block elements).
In the former case, an errant node is eliminated completely; in the latter case, the text of the node is preserved (as the parent node does allow PCDATA). Custom content model implementations are probably not the best way of handling these cases; node bubbling should be implemented instead.
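
As an illustration of the two repair strategies described in section 2, the following is a minimal Python sketch. It is not HTML Purifier code (the library is written in PHP), and it does not reproduce StrictBlockquote. The Element class, the INLINE set, and the function names wrap_inline_children and unwrap_paragraphs are all hypothetical names introduced for this example. The sketch also assumes that a run of adjacent inline nodes is grouped into a single wrapper element, which the text above does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Hypothetical, deliberately incomplete set of inline element names.
INLINE = {"a", "b", "em", "i", "span", "strong"}


@dataclass
class Element:
    """A toy element node; a plain str stands for text (PCDATA)."""
    name: str
    children: List["Node"] = field(default_factory=list)


Node = Union[Element, str]


def is_inline(node: Node) -> bool:
    # Text counts as inline content for the purposes of this sketch.
    return isinstance(node, str) or node.name in INLINE


def wrap_inline_children(children: List[Node], wrapper: str = "p") -> List[Node]:
    """Block-only parent (e.g. <blockquote>): wrap inline runs in a block tag.

    Assumption: adjacent inline nodes share one wrapper element.
    """
    result: List[Node] = []
    run: List[Node] = []
    for child in children:
        if is_inline(child):
            run.append(child)
            continue
        if run:
            # A block element ends the current inline run.
            result.append(Element(wrapper, run))
            run = []
        result.append(child)
    if run:
        result.append(Element(wrapper, run))
    return result


def unwrap_paragraphs(children: List[Node]) -> List[Node]:
    """Inline-only parent (e.g. <address>): replace each <p> child with
    a leading <br />, the paragraph's contents, and a trailing <br />."""
    result: List[Node] = []
    for child in children:
        if isinstance(child, Element) and child.name == "p":
            result.append(Element("br"))
            result.extend(child.children)
            result.append(Element("br"))
        else:
            result.append(child)
    return result
```

For example, calling wrap_inline_children on the children "Hello ", <em>world</em>, <div>block</div> returns a single <p> holding the first two nodes, followed by the untouched <div>. The wrapper argument plays the role that %HTML.BlockWrapper plays in the description above.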