Update txt docs.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1134 48356398-32a2-884e-a903-53898d9a118a
2024-12-22 16:31:53 +00:00 · 2007-06-09 14:53:21 +00:00 · 58f00105c8
commit 58f00105c8
parent 8d15d1ce13
6 changed files with 42 additions and 92 deletions
--- a/docs/enduser-security.txt
+++ b/docs/enduser-security.txt
@ -8,15 +8,11 @@ to be effective. Things to remember:
 1. Character Encoding: see enduser-utf8.html for more info.
-2. Doctype: document pending feature completion
+2. IDs: see enduser-id.html for more info
 Not strictly necessary, actually. More in-depth discussion once we figure
 out how to get strict loose mode working.
-3. IDs: see enduser-id.html for more info
+3. Links: document pending feature completion
 4. Links: document pending feature completion
 Rudimentary blacklisting, we should also allow only relative URIs. We
 need a doc to explain the stuff.
-5. CSS: document pending
+4. CSS: document pending
 Explain which CSS styles we blocked and why.
--- a/docs/index.html
+++ b/docs/index.html
@ -141,12 +141,6 @@ the code. They may be upgraded to HTML files or stay as TXT scratchpads.</p>
    <td>List of vendor-specific tags we may want to transform to W3C compliant markup.</td>
 </tr>
 <tr>
    <td>Reference</td>
    <td><a href="ref-strictness.txt">Strictness</a></td>
    <td>Short essay on how loose definition isn't really loose.</td>
 </tr>
 <tr>
    <td>Reference</td>
    <td><a href="ref-html-modularization.txt">Modularization of HTMLDefinition</a></td>
--- a/docs/proposal-config.txt
+++ b/docs/proposal-config.txt
@ -1,6 +1,5 @@
 Configuration
    [needs updating]
 Configuration is documented on a per-use-case basis: if a class uses a certain
 value from the configuration object, it has to define its name and what the
@ -13,29 +12,10 @@ the documentation in ConfigDef for more information on these namespaces.
 Since configuration is dependent on context, internal classes require a
 configuration object to be passed as a parameter.  (They also require a
-Context object).
+Context object). A majority of classes do not need the config object,
 but for those who do, it is a lifesaver.
-In relation to HTMLDefinition and CSSDefinition, there could be a special class
+Definition objects are complex datatypes influenced by their respective
-of directives that influence the *construction* of the Definition object.
+directive namespaces (HTMLDefinition with HTML and CSSDefinition with CSS).
-A theoretical call pattern would look like:
+If any of these directives is updated, HTML Purifier forces the definition
-
+to be regenerated.
 1. Client calls Config->getHTMLDefinition()
 2. Config calls HTMLDefinition->createNew(this)
 3. HTMLDefinition constructs itself with base configuration
 4. HTMLDefinition calls Config->get('HTML')
 5. Config returns array of directives
 6. HTMLDefinition performs operations and changes specified by directives
 7. HTMLPurifier returns constructed definition
 8. Config caches definition so it doesn't have to be generated again
 9. Config returns definition
 You could also override Config's copy of the definition with your own
 custom copy, which OVERRIDES all directives.  Only the base, vanilla copy
 is the Singleton; the object actually interfaced with is an operated-upon
 clone of that object.  Also, if an update to the directives would update
 the definition, you'd have to force reconstruction.
 In practice, pulling directives from the config object is
 solely need-based, and the flex points are littered throughout the
 setup() function.  Some sort of refactoring is likely in order. See
 ref-xhtml-1.1.txt for more info.
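
The caching behavior described in proposal-config.txt (build the definition from its directive namespace, cache it, and regenerate it when a directive in that namespace changes) can be sketched roughly as follows. This is a minimal illustration in Python, not HTML Purifier's actual implementation: the `Config` and `Definition` classes, their method names, the `builds` counter, and the `Strict` directive are all hypothetical names chosen for the example.

```python
class Definition:
    """Hypothetical stand-in for HTMLDefinition / CSSDefinition."""

    def __init__(self, namespace, directives):
        self.namespace = namespace
        # Snapshot of the directives that shaped this definition.
        self.directives = dict(directives)


class Config:
    """Hypothetical config object that caches one definition per namespace."""

    def __init__(self):
        self._directives = {}   # namespace -> {directive name: value}
        self._definitions = {}  # namespace -> cached Definition
        self.builds = 0         # how many times a definition was constructed

    def set(self, namespace, directive, value):
        self._directives.setdefault(namespace, {})[directive] = value
        # Updating any directive in a namespace drops the cached definition,
        # which forces it to be regenerated on the next request.
        self._definitions.pop(namespace, None)

    def get(self, namespace):
        # Return a copy so callers cannot mutate the stored directives.
        return dict(self._directives.get(namespace, {}))

    def get_definition(self, namespace):
        if namespace not in self._definitions:
            self.builds += 1
            self._definitions[namespace] = Definition(
                namespace, self.get(namespace)
            )
        return self._definitions[namespace]
```

Under these assumptions, repeated calls to `get_definition("HTML")` return the same cached object until a directive in the `HTML` namespace is set again. Changing a `CSS` directive leaves the cached `HTML` definition untouched.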
--- a/docs/proposal-filter-levels.txt
+++ b/docs/proposal-filter-levels.txt
@ -2,23 +2,16 @@
 Filter Levels
    When one size *does not* fit all
-The more I think about it, the less sense it makes for maintaining one huge
+It makes little sense to constrain users to one set of HTML elements and
-monolithic HTMLDefinition class.  There's simply so much variation that
+attributes and tell them that they are not allowed to mold this in
-could go into this definition: the set of HTML good for blog entries is
+any fashion.  Many users demand to be able to custom-select which elements
-definitely too large for HTML that would be allowed in blog comments. Going
+and attributes they want.  This is fine: because HTML Purifier keeps close
-from Transitional to Strict requires changes to the definition.
+track of what elements are safe to use, there is no way for them to
 accidentally allow an XSS-able tag.
-Allowing users to specify their own whitelists is one step (implemented, btw), 
+However, combing through the HTML spec to make your own whitelist can
-but I have doubts on only doing this. Simply put, the typical programmer is too 
+be a daunting task.  HTML Purifier ought to offer pre-canned filter levels
-lazy to actually go through the trouble of investigating which tags, attributes 
+that amateur users can select based on what they think is their use-case.
 and properties to allow. HTMLDefinition makes a big part of what HTMLPurifier 
 is. 
 The idea, then, is to set up fundamentally different sets of definitions, which
 can further be customized using simpler configuration options.  Alternatively,
 they could be implemented as configuration profiles, which simply load
 a set of recommended directives to achieve a desired effect (no simpler
 config options though).
 Here are some fuzzy levels you could set:
@ -46,6 +39,10 @@ make forbidden element to text transformations desirable (for example, images).
 == Element Risk Analysis ==
 Although none of the currently supported elements presents a security
 threat per se, some can cause problems for page layouts or be
 extremely complicated.
 Legend:
    [danger level] - regular tags / uncommon tags ~ deprecated tags
    [danger level]* - rare tags
@ -130,6 +127,7 @@ any CSS properties that are not currently implemented (such as position).
 Dangerous, can go outside container - float
 Easy to abuse - font-size, font-family (font), width
 Colored - background-color (background), border-color (border), color
    (see proposal-colors.html)
 Dramatic - border, list-style-position (list-style), margin, padding,
    text-align, text-indent, text-transform, vertical-align, line-height
--- a/docs/ref-strictness.txt
+++ b/docs/ref-strictness.txt
@ -1,33 +0,0 @@
 Is HTML Purifier Strict or Transitional?
    [rename/deprecation pending]
 Despite the fact that HTML Purifier professes to support both transitional and
 strict HTML, it rejects a lot of attributes and elements that are actually, indeed,
 valid. You can investigate progress.html to find out precisely what we
 are doing to these *deprecated* attributes.
 However, users have found that Strict HTML imposes some quite unreasonable
 restrictions on certain things. The start and value attributes in ol and
 li (respectively) perhaps are the most contested. There is currently no
 widely supported browser method short of JavaScript that can replace these
 two deprecated attributes. It behooves us to allow these deprecated
 attributes when the output is transitional.
 Fortunately, that's the only real bugger case. The others have near-perfect
 CSS equivalents, and were presentational anyway. However, the other question
 pops up: should we always convert these to the CSS forms when 1. the spec
 allows them anyway and 2. older browsers support them better? After all, the
 whole point about CSS is to separate styling from content, so inline styling
 doesn't solve that problem.
 [new material]
 HTML Purifier 1.7 creates a new organizational system for deprecated attribute/
 element transformations. They will be unified under the title of "Tidy", which
 is what they are: cleaning up after deprecated user markup into standards-compliant
 versions. There will also be a change in the default behavior (although, to the
 end user not inspecting the HTML, there will be no change: in fact, it may
 work even better).
 Consult the Advanced API for more details.
--- a/docs/ref-whatwg.txt
+++ b/docs/ref-whatwg.txt
@ -2,8 +2,23 @@
 Web Hypertext Application Technology Working Group
    WHATWG
-I don't think we need to worry about them.  Untrusted users shouldn't be
+== HTML 5 ==
 submitting applications, eh?  But if some interesting attribute pops up in
 their spec, and might be worth supporting, stick it here.
-HTML 5!!!
+URL: http://www.whatwg.org/specs/web-apps/current-work/
 HTML 5 defines a kaboodle of new elements and attributes, as well as
 some well-defined, "quirks mode" HTML parsing.  Although WHATWG professes
 to be targeted towards web applications, many of their semantic additions
 would be quite useful in regular documents. Eventually, HTML
 Purifier will need to audit their lists and figure out what changes need
 to be made.  This process is complicated by the fact that the WHATWG
 doesn't buy into W3C's modularization of XHTML 1.1: we may need
 to remodularize HTML 5 (probably done by section name). No sense in
 committing ourselves till the spec stabilizes, though.
 More immediately relevant, however, is the well-defined parsing
 behavior that HTML 5 adds. While I have little interest in writing
 another DirectLex parser, other parsers like ph5p 
 <http://jero.net/lab/ph5p/> can be adapted to DOMLex to support much more
 flexible HTML parsing (a cool feature I've seen is how they resolve
 <b>bold<i>both</b>italic</i>).