Update docs.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@700 48356398-32a2-884e-a903-53898d9a118a
2024-12-22 16:31:53 +00:00 · 2007-01-29 17:53:54 +00:00 · be264a4b20
commit be264a4b20
parent 01c85b71d2
2 changed files with 11 additions and 39 deletions
--- a/docs/enduser-overview.txt
+++ b/docs/enduser-overview.txt
@@ -36,7 +36,7 @@ forgiving lexer.  You may also be interested in the unit tests located in the
 tests/ folder, which provide a living document on how exactly the filter deals
 with malformed input.
-In summary:
+In summary (see corresponding classes for more details):
 1. Parse document into an array of tag and text tokens (Lexer)
 2. Remove all elements not on whitelist and transform certain other elements
--- a/docs/enduser-security.txt
+++ b/docs/enduser-security.txt
@@ -6,45 +6,17 @@ through negligence of people. This class will do its job: no more, no less,
 and it's up to you to provide it the proper information and proper context
 to be effective. Things to remember:
-1. Character Encoding: UTF-8.
+1. Character Encoding: see enduser-utf8.html for more info.
    This segment will soon be obsoleted by enduser-utf8.html
 Currently, the parser runs under the assumption that it is dealing
 with UTF-8. Not ISO-8859-1 or Windows-1252, UTF-8. And definitely not "no
 character encoding explicitly stated" or UTF-7. If you're not using UTF-8 as
 your character encoding, make sure you configure HTML Purifier or switch
 to UTF-8. Now. Also, make sure any input is properly converted to UTF-8, or
 the parser will mangle it badly (though it won't be a security risk if you're
 outputting it as UTF-8 though).  Character encoding is, in general, a knotty
 issue, but do yourself a favor and learn about it:
 <http://www.joelonsoftware.com/articles/Unicode.html>
-2. Doctype: XHTML 1.0 Transitional
+2. Doctype: document pending feature completion
-This is what the parser is outputting. For the most
-part, it's compatible with HTML 4.01, but XHTML enforces some very nice things
+Not strictly necessary, actually. More in-depth discussion once we figure
+out how to get strict loose mode working.
 that all web developers should use. Regardless, NO DOCTYPE is a NO. Quirks mode
 has waaaay too many quirks for a little parser to handle.  We did not select
 strict in order to prevent ourselves from being too draconic on users, but
 this may be configurable in the future.  Do you want standards compliance?
 The doctype is a good place to start.
-3. IDs
+3. IDs: see enduser-id.html for more info
    This segment is obsoleted by enduser-id.html
 They need to be unique, but without some knowledge of the
 rest of the document, it's difficult to know what's unique. %Attr.IDBlacklist
 needs to be set: we may want to consider disallowing IDs by default to
 save lazy programmers.
-4. [PROJECTED] Links
+4. Links: document pending feature completion
-We're not going to try for spam protection (although
-some hooks for such a module might be nice) but we may offer the ability to
+Rudimentary blacklisting, we should also allow only relative URIs. We
+need a doc to explain the stuff.
 only accept relative URLs. Pick the one that's right for you.
-5. CSS
+5. CSS: document pending
-While we can prevent the most flagrant cases from affecting your
+Explain which CSS styles we blocked and why.
 layout (such as absolutely positioned elements), no amount of code is going
 to protect your pages from being attacked by garish colors and plain old
 bad taste.  A neat feature would be the ability to define acceptable colors
 in a document, but that's not likely to be implemented for a while.  In the
 meantime, be sure to make sure that floated elements (permitted, since they
 can be quite useful) can't mess up your layout. Once again, we may want to
 disable this by default to protect lazy developers.