mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2024-11-08 14:58:42 +00:00
Rename and rewrite content models docs.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1123 48356398-32a2-884e-a903-53898d9a118a
This commit is contained in:
parent
b442d09ea6
commit
2e089477a5
48
docs/ref-content-models.txt
Normal file
@@ -0,0 +1,48 @@
Handling Content Model Changes


1. Context

The distinction between Transitional and Strict document types is somewhat
of an anomaly in the lineage of XHTML document types (following 1.0,
doctypes no longer have flavors: instead, modularization is used to let
document authors vary their elements). The transition is usually quite
straightforward, as the W3C mostly deprecates attributes or elements, which
are easily handled using tag and attribute transforms.

However, for three elements, <blockquote>, <body> and <address>, the W3C
elected to also change the content model. <blockquote> and <body> originally
accepted both inline and block elements, but in the strict doctype they
only allow block elements. With <address>, the situation is inverted:
<p> tags are now forbidden from appearing within this tag.


2. Current situation

Currently, HTML Purifier treats <blockquote> specially during Tidy mode
using a custom ChildDef class, StrictBlockquote. StrictBlockquote
operates similarly to Required, except that when it encounters an inline
element, it wraps it in a block tag (as specified by
%HTML.BlockWrapper, which defaults to <p>). The naming suggests it can
only be used for <blockquote>s, although it may be possible to
generalize it to work on other cases of this nature (this would be of
little practical application, as no other element in XHTML 1.1 or earlier
has a block-only content model).
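
The wrapping behaviour described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not HTML Purifier's
actual (PHP) implementation: the node representation (plain strings for
text, (tag, children) tuples for elements), the BLOCK_ELEMENTS set, the
function name and the choice to drop whitespace-only runs are all
assumptions made for the example.

```python
# Hypothetical, abbreviated set of block-level tag names for this sketch.
BLOCK_ELEMENTS = {"p", "div", "ul", "ol", "pre", "blockquote", "table", "hr"}


def wrap_inline_children(children, wrapper="p"):
    """Wrap each run of consecutive inline children in a block wrapper.

    A child is either a text string or a (tag_name, children) tuple.
    `wrapper` plays the role of %HTML.BlockWrapper.
    """
    result = []
    run = []  # current run of consecutive inline nodes

    def flush():
        # Assumption: runs made only of whitespace text are dropped.
        if any(not isinstance(n, str) or n.strip() for n in run):
            result.append((wrapper, list(run)))
        run.clear()

    for child in children:
        is_block = not isinstance(child, str) and child[0] in BLOCK_ELEMENTS
        if is_block:
            flush()               # close the pending inline run first
            result.append(child)  # block children pass through unchanged
        else:
            run.append(child)
    flush()
    return result
```

For instance, a <blockquote> holding some text, an <em> and then a <p>
would come out with the text and the <em> grouped into a new <p>, followed
by the original <p> untouched.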

Tidy currently contains no custom, lenient implementation for <address>.
If one were to be written, it would likely operate on the principle that,
whenever a <p> tag is encountered, it is replaced with a leading and a
trailing <br /> tag (the contents of <p>, being inline, are
not an issue). There is no prior work with this sort of operation.
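
Since no such implementation exists, the following Python sketch only
shows what the proposed principle might look like. The node representation
(strings for text, (tag, children) tuples for elements) and the function
name are hypothetical and chosen for this example.

```python
def replace_paragraphs_with_breaks(children):
    """Replace each <p> child with <br />, its contents, then <br />.

    A child is either a text string or a (tag_name, children) tuple.
    """
    result = []
    for child in children:
        if not isinstance(child, str) and child[0] == "p":
            result.append(("br", []))  # leading break
            result.extend(child[1])    # inline contents are kept as-is
            result.append(("br", []))  # trailing break
        else:
            result.append(child)
    return result
```

A real implementation would probably also want to collapse the doubled
breaks that appear between adjacent paragraphs; the sketch leaves them in.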


3. Outside applicability

There are a number of other elements that have restrictive content
models, such as <ul> or <span> (the latter is restrictive in that it
does not allow block elements). In the former case, an errant node
is eliminated completely; in the latter case, the text of the node
is preserved (as the parent node does allow PCDATA). Custom
content model implementations are probably not the best way of handling
these cases; node bubbling should be implemented instead.
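
The two outcomes described above can be mimicked with a toy filter. The
sketch below is written in Python under stated assumptions: the
CONTENT_MODELS table is a hypothetical, heavily abbreviated stand-in for
the real definitions, nodes are strings or (tag, children) tuples, and
flattening an errant node to its text is a simplification. It does not
implement node bubbling.

```python
# Hypothetical, abbreviated content models for two parent elements.
CONTENT_MODELS = {
    "ul": {"allowed": {"li"}, "pcdata": False},
    "span": {"allowed": {"em", "strong", "a", "br", "span"}, "pcdata": True},
}


def text_of(node):
    """Return the concatenated text found inside a node."""
    if isinstance(node, str):
        return node
    return "".join(text_of(child) for child in node[1])


def filter_children(parent, children):
    """Drop errant children, keeping their text if the parent allows PCDATA."""
    model = CONTENT_MODELS[parent]
    result = []
    for child in children:
        if isinstance(child, str):
            if model["pcdata"]:
                result.append(child)
        elif child[0] in model["allowed"]:
            result.append(child)
        elif model["pcdata"]:
            result.append(text_of(child))  # tag removed, text preserved
        # otherwise the errant node is eliminated completely
    return result
```

With this table, a <div> inside a <ul> disappears along with its text,
while a <div> inside a <span> is reduced to the text it contained.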

@@ -1,18 +0,0 @@
Loose versus Strict
[rename/deprecation pending]

The most common change between doctypes are between the two flavors of HTML 4.01 and
XHTML 1.0: Transitional (Loose) and Strict. Besides deprecated attributes and elements
(which are quite easy to identify), there are two content model changes that were
made:

BLOCKQUOTE changes from 'flow' to 'block'
    current behavior: inline inner contents should not be nuked, block-ify as necessary
ADDRESS from potpourri to Inline (removes p tags)
    current behavior: block tags silently dropped
    ideal behavior: replace block elements with something like <br>. (not high priority,
    somewhat difficult to implement)

We're missing strict support for U, S, STRIKE: this needs to be fixed soon (and
is quite simple to fix).