mirror of https://github.com/ezyang/htmlpurifier.git (synced 2025-03-23 14:27:02 +00:00)
Update docs, esp in context of soon to be added tag transforms.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@666 48356398-32a2-884e-a903-53898d9a118a
parent fbe2c25f8a
commit a8db22dfff
@@ -7,6 +7,7 @@ and it's up to you to provide it the proper information and proper context
 to be effective. Things to remember:
 
 1. Character Encoding: UTF-8.
+This segment will soon be obsoleted by enduser-utf8.html
 Currently, the parser runs under the assumption that it is dealing
 with UTF-8. Not ISO-8859-1 or Windows-1252, UTF-8. And definitely not "no
 character encoding explicitly stated" or UTF-7. If you're not using UTF-8 as
@@ -27,6 +28,7 @@ this may be configurable in the future. Do you want standards compliance?
 The doctype is a good place to start.
 
 3. IDs
+This segment is obsoleted by enduser-id.html
 They need to be unique, but without some knowledge of the
 rest of the document, it's difficult to know what's unique. %Attr.IDBlacklist
 needs to be set: we may want to consider disallowing IDs by default to
@@ -14,15 +14,15 @@ Since configuration is dependant on context, internal classes require a
 configuration object to be passed as a parameter. (They also require a
 Context object).
 
-In relation to HTMLDefinition and CSSDefinition, there is a special class
+In relation to HTMLDefinition and CSSDefinition, there could be a special class
 of directives that influence the *construction* of the Definition object.
-A standard call pattern would look like:
+A theoretical call pattern would look like:
 
 1. Client calls Config->getHTMLDefinition()
 2. Config calls HTMLDefinition->createNew(this)
 3. HTMLDefinition constructs itself with base configuration
-4. HTMLDefinition calls Config->get('HTMLDefinition')
-5. Config returns array of directives that later construction
+4. HTMLDefinition calls Config->get('HTML')
+5. Config returns array of directives
 6. HTMLDefinition performs operations and changes specified by directives
 7. HTMLPurifier returns constructed definition
 8. Config caches definition so it doesn't have to be generated again
@@ -33,3 +33,7 @@ custom copy, which OVERRIDES all directives. Only the base, vanilla copy
 is the Singleton, the object actually interfaced with is a operated-upon
 clone of that object. Also, if an update to the directives would update
 the definition, you'd have to force reconstruction.
+
+In practice, the pulling directives from the config object are
+solely need-based, and the flex points are littered throughout the
+setup() function. Some sort of refactoring is likely in order.
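
Note on the diff above: the "theoretical call pattern" (steps 1 to 8) together with the remark about forced reconstruction can be sketched roughly as follows. This is an illustrative Python model only, not HTML Purifier's PHP code: the class and method names mirror the steps, while `ForbiddenElements`, `allowed_elements`, `set` and `builds` are hypothetical names invented for the sketch.

```python
class HTMLDefinition:
    """Hypothetical stand-in for the definition object in the call pattern."""

    def __init__(self):
        # Step 3: construct with a base configuration (illustrative values).
        self.allowed_elements = {"p", "b", "i", "u"}

    @classmethod
    def create_new(cls, config):
        definition = cls()
        # Steps 4-5: ask the config for the 'HTML' directives.
        directives = config.get("HTML")
        # Step 6: apply the changes requested by the directives.
        # 'ForbiddenElements' is a made-up directive name for this sketch.
        for name in directives.get("ForbiddenElements", []):
            definition.allowed_elements.discard(name)
        return definition


class Config:
    def __init__(self, directives=None):
        self._directives = directives or {}
        self._html_definition = None  # cache slot for step 8
        self.builds = 0  # counts constructions, for demonstration only

    def get(self, namespace):
        # Return a copy so callers cannot mutate the stored directives.
        return dict(self._directives.get(namespace, {}))

    def set(self, namespace, key, value):
        self._directives.setdefault(namespace, {})[key] = value
        # A directive update may change the definition: drop the cache
        # so the next request forces reconstruction.
        self._html_definition = None

    def get_html_definition(self):
        # Steps 1-2 and 7-8: build on first request, then serve the cache.
        if self._html_definition is None:
            self._html_definition = HTMLDefinition.create_new(self)
            self.builds += 1
        return self._html_definition
```

Calling `get_html_definition()` twice returns the same cached object; calling `set(...)` in between discards the cache, which is one simple way to model the "force reconstruction" caveat.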
@@ -15,7 +15,10 @@ and properties to allow. HTMLDefinition makes a big part of what HTMLPurifier
 is.
 
 The idea, then, is to setup fundamentally different set of definitions, which
-can further be customized using simpler configuration options.
+can further be customized using simpler configuration options. Alternatively,
+they could be implemented as configuration profiles, which simply load
+a set of recommended directives to acheive a desired affect (no simpler
+config options though).
 
 Here are some fuzzy levels you could set:
 
@@ -2,8 +2,8 @@
 Is HTML Purifier Strict or Transitional?
 A little bit of helpful guidance
 
-Despite the fact that HTML Purifier professes only to support transitional
-HTML, it rejects a lot of attributes and elements that are actually, indeed,
+Despite the fact that HTML Purifier professes to support both transitional and
+strict HTML, it rejects a lot of attributes and elements that are actually, indeed,
 valid. You can investigate progress.html to find out precisely what we
 are doing to these *deprecated* attributes.
 
@@ -11,8 +11,8 @@ However, users have found that Strict HTML imposes some quite unreasonable
 restrictions on certain things. The start and value attributes in ol and
 li (respectively) perhaps are the most contested. There's is currently no
 widely supported browser method short of JavaScript that can replace these
-two deprecated elements. HTML Purifier does not currently support them, but
-it might behoove us to do so while our output is still transitional.
+two deprecated elements. It behooves us to allow these deprecated
+attributes when the output is transitional.
 
 Fortunantely, that's the only real bugger case. The others have near-perfect
 CSS equivalents, and were presentational anyway. However, the other question
@@ -32,5 +32,6 @@ these loose-only constructs in loose mode:
 
 The changed child definitions as well as the ul.start li.value are the most
 compelling reasons why loose should be used. We may want offer disabling <u>,
-<strike> and <s> by themselves.
+<strike> and <s> by themselves. We may also want to offer no pre-emptive
+deprecated conversions. This all must be unified.
 