From a8db22dfffceffa5520160a2f9760465e83d8736 Mon Sep 17 00:00:00 2001
From: "Edward Z. Yang"
Date: Sat, 20 Jan 2007 03:59:07 +0000
Subject: [PATCH] Update docs, esp in context of soon to be added tag transforms.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@666 48356398-32a2-884e-a903-53898d9a118a
---
 docs/enduser-security.txt       |  2 ++
 docs/proposal-config.txt        | 12 ++++++++----
 docs/proposal-filter-levels.txt |  5 ++++-
 docs/ref-strictness.txt         | 11 ++++++-----
 4 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/docs/enduser-security.txt b/docs/enduser-security.txt
index 695853d5..e7c9a8ce 100644
--- a/docs/enduser-security.txt
+++ b/docs/enduser-security.txt
@@ -7,6 +7,7 @@ and it's up to you to provide it the proper information and proper context
 to be effective. Things to remember:
 
 1. Character Encoding: UTF-8.
+ This segment will soon be obsoleted by enduser-utf8.html
 Currently, the parser runs under the assumption that it is dealing with
 UTF-8. Not ISO-8859-1 or Windows-1252, UTF-8. And definitely not "no
 character encoding explicitly stated" or UTF-7. If you're not using UTF-8 as
@@ -27,6 +28,7 @@ this may be configurable in the future. Do you want standards compliance?
 The doctype is a good place to start.
 
 3. IDs
+ This segment is obsoleted by enduser-id.html
 They need to be unique, but without some knowledge of the rest of the
 document, it's difficult to know what's unique. %Attr.IDBlacklist needs to
 be set: we may want to consider disallowing IDs by default to
diff --git a/docs/proposal-config.txt b/docs/proposal-config.txt
index 0ac54c67..d291a3fb 100644
--- a/docs/proposal-config.txt
+++ b/docs/proposal-config.txt
@@ -14,15 +14,15 @@ Since configuration is dependant on context, internal classes require a
 configuration object to be passed as a parameter. (They also require a
 Context object).
 
-In relation to HTMLDefinition and CSSDefinition, there is a special class
+In relation to HTMLDefinition and CSSDefinition, there could be a special class
 of directives that influence the *construction* of the Definition object.
-A standard call pattern would look like:
+A theoretical call pattern would look like:
 
 1. Client calls Config->getHTMLDefinition()
 2. Config calls HTMLDefinition->createNew(this)
 3. HTMLDefinition constructs itself with base configuration
-4. HTMLDefinition calls Config->get('HTMLDefinition')
-5. Config returns array of directives that later construction
+4. HTMLDefinition calls Config->get('HTML')
+5. Config returns array of directives
 6. HTMLDefinition performs operations and changes specified by directives
 7. HTMLPurifier returns constructed definition
 8. Config caches definition so it doesn't have to be generated again
@@ -33,3 +33,7 @@ custom copy, which OVERRIDES all directives. Only the base, vanilla copy is
 the Singleton, the object actually interfaced with is a operated-upon clone
 of that object. Also, if an update to the directives would update the
 definition, you'd have to force reconstruction.
+
+In practice, the pulling of directives from the config object is
+solely need-based, and the flex points are littered throughout the
+setup() function. Some sort of refactoring is likely in order.
diff --git a/docs/proposal-filter-levels.txt b/docs/proposal-filter-levels.txt
index 83b3fced..a8306152 100644
--- a/docs/proposal-filter-levels.txt
+++ b/docs/proposal-filter-levels.txt
@@ -15,7 +15,10 @@ and properties to allow. HTMLDefinition makes a big part of what HTMLPurifier
 is.
 
 The idea, then, is to setup fundamentally different set of definitions, which
-can further be customized using simpler configuration options.
+can further be customized using simpler configuration options. Alternatively,
+they could be implemented as configuration profiles, which simply load
+a set of recommended directives to achieve a desired effect (no simpler
+config options though).
 
 Here are some fuzzy levels you could set:
 
diff --git a/docs/ref-strictness.txt b/docs/ref-strictness.txt
index e383a29b..81907c1e 100644
--- a/docs/ref-strictness.txt
+++ b/docs/ref-strictness.txt
@@ -2,8 +2,8 @@ Is HTML Purifier Strict or Transitional?
 A little bit of helpful guidance
 
-Despite the fact that HTML Purifier professes only to support transitional
-HTML, it rejects a lot of attributes and elements that are actually, indeed,
+Despite the fact that HTML Purifier professes to support both transitional and
+strict HTML, it rejects a lot of attributes and elements that are actually, indeed,
 valid. You can investigate progress.html to find out precisely what we are
 doing to these *deprecated* attributes.
 
@@ -11,8 +11,8 @@ However, users have found that Strict HTML imposes some quite unreasonable
 restrictions on certain things. The start and value attributes in ol and li
 (respectively) perhaps are the most contested. There's is currently no
 widely supported browser method short of JavaScript that can replace these
-two deprecated elements. HTML Purifier does not currently support them, but
-it might behoove us to do so while our output is still transitional.
+two deprecated elements. It behooves us to allow these deprecated
+attributes when the output is transitional.
 
 Fortunantely, that's the only real bugger case. The others have near-perfect
 CSS equivalents, and were presentational anyway. However, the other question
@@ -32,5 +32,6 @@ these loose-only constructs in loose mode:
 
 The changed child definitions as well as the ul.start li.value are the most
 compelling reasons why loose should be used. We may want offer disabling ,
- and by themselves.
+ and by themselves. We may also want to offer no pre-emptive
+deprecated conversions. This all must be unified.