mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2025-03-23 14:27:02 +00:00
Update documentation.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@418 48356398-32a2-884e-a903-53898d9a118a
parent e440f25bce
commit b5c69d8ca5
@@ -24,8 +24,7 @@ AttrDef
 Number - constructor interface is inconsistent with Integer
 AttrTransform - doesn't accept AttrContext, non-validating
 ChildDef - not-allowed nodes translated to text, likely invalid handling
-Config - "load configuration" hooks missing, rich set* accessors missing,
-needs redefined relationship with the definitions
+Config - "load configuration" hooks missing, rich set* accessors missing
 Strategy
 FixNesting - cannot bubble nodes out of structures
 MakeWellFormed - insufficient automatic closing definitions (check HTML
@@ -17,18 +17,9 @@ are passed. These classes are: HTMLPurifier::*, Generator::generateFromTokens
 and Lexer::tokenizeHTML. However, whenever a valid configuration object
 is defined, that object should be used.
 
--- the following is projected changes to the configuration system --
-
-In relation to HTMLDefinition and CSSDefinition, there are going to be some
-major structural changes to enable the easy configuration of these objects.
-Due to the intricacy of these objects, it's not feasible to ask an average
-user to twiddle around with an element and its 20 other dependencies. However,
-these objects are the only possible point where change could occur in the
-context of configuration.
-
-The solution is to introduce a special class of directives that influence the
-*construction* of the Definition object. A standard call pattern would look
-like:
+In relation to HTMLDefinition and CSSDefinition, there is a special class
+of directives that influence the *construction* of the Definition object.
+A standard call pattern would look like:
 
 1. Client calls Config->getHTMLDefinition()
 2. Config calls HTMLDefinition->createNew(this)
@@ -4,7 +4,7 @@ Optimization
 Here are some possible optimization techniques we can apply to code sections if
 they turn out to be slow. Be sure not to prematurely optimize though!
 
-- Make Tokens Flyweights
+- Make Tokens Flyweights (may prove problematic, probably not worth it)
 - Rewrite regexps into PHP code
 - Serialize the Definition object
 - Batch regexp validation (do as many per function call as possible)
@@ -12,8 +12,6 @@ character encoding explicitly stated" or UTF-7. If you're not using UTF-8 as
 your character encoding, you should switch. Now. Make sure any input is
 properly converted to UTF-8, or the parser will mangle it badly
 (though it won't be a security risk if you're outputting it as UTF-8 though).
-We will be adding out-of-the-box support for the other major character
-encodings shortly.
 
 2. XHTML 1.0 Transitional. This is what the parser is outputting. For the most
 part, it's compatible with HTML 4.01, but XHTML enforces some very nice things
@@ -37,4 +35,5 @@ to protect your pages from being attacked by garish colors and plain old
 bad taste. A neat feature would be the ability to define acceptable colors
 in a document, but that's not likely to be implemented for a while. In the
 meantime, be sure to make sure that floated elements (permitted, since they
-can be quite useful) can't mess up your layout.
+can be quite useful) can't mess up your layout. Once again, we may want to
+disable this by default to protect lazy developers.
@@ -54,4 +54,4 @@ HTML Purifier is best suited for documents that require a rich array of
 HTML tags. Things like blog comments are, in all likelihood, most appropriately
 written in an extremely restrictive set of markup that doesn't require
 all this functionality (or not written in HTML at all), although this may
-be changing in the future.
+be changing in the future with the addition of levels of filtering.