Update docs.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@700 48356398-32a2-884e-a903-53898d9a118a
2024-12-22 08:21:52 +00:00 · 2007-01-29 17:53:54 +00:00 · 2007-01-29 17:53:54 +00:00 · be264a4b20
commit be264a4b20
parent 01c85b71d2
2 changed files with 11 additions and 39 deletions
--- a/docs/enduser-overview.txt
+++ b/docs/enduser-overview.txt
@@ -36,7 +36,7 @@ forgiving lexer.  You may also be interested in the unit tests located in the
 tests/ folder, which provide a living document on how exactly the filter deals
 with malformed input.

-In summary:
+In summary (see corresponding classes for more details):

 1. Parse document into an array of tag and text tokens (Lexer)
 2. Remove all elements not on whitelist and transform certain other elements
--- a/docs/enduser-security.txt
+++ b/docs/enduser-security.txt
@@ -6,45 +6,17 @@ through negligence of people. This class will do its job: no more, no less,
 and it's up to you to provide it the proper information and proper context
 to be effective. Things to remember:

-1. Character Encoding: UTF-8.
-    This segment will soon be obsoleted by enduser-utf8.html
-Currently, the parser runs under the assumption that it is dealing
-with UTF-8. Not ISO-8859-1 or Windows-1252, UTF-8. And definitely not "no
-character encoding explicitly stated" or UTF-7. If you're not using UTF-8 as
-your character encoding, make sure you configure HTML Purifier or switch
-to UTF-8. Now. Also, make sure any input is properly converted to UTF-8, or
-the parser will mangle it badly (though it won't be a security risk if you're
-outputting it as UTF-8 though).  Character encoding is, in general, a knotty
-issue, but do yourself a favor and learn about it:
-<http://www.joelonsoftware.com/articles/Unicode.html>
+1. Character Encoding: see enduser-utf8.html for more info.

-2. Doctype: XHTML 1.0 Transitional
-This is what the parser is outputting. For the most
-part, it's compatible with HTML 4.01, but XHTML enforces some very nice things
-that all web developers should use. Regardless, NO DOCTYPE is a NO. Quirks mode
-has waaaay too many quirks for a little parser to handle.  We did not select
-strict in order to prevent ourselves from being too draconic on users, but
-this may be configurable in the future.  Do you want standards compliance?
-The doctype is a good place to start.
+2. Doctype: document pending feature completion
+Not strictly necessary, actually. More in-depth discussion once we figure
+out how to get strict loose mode working.

-3. IDs
-    This segment is obsoleted by enduser-id.html
-They need to be unique, but without some knowledge of the
-rest of the document, it's difficult to know what's unique. %Attr.IDBlacklist
-needs to be set: we may want to consider disallowing IDs by default to
-save lazy programmers.
+3. IDs: see enduser-id.html for more info

-4. [PROJECTED] Links
-We're not going to try for spam protection (although
-some hooks for such a module might be nice) but we may offer the ability to
-only accept relative URLs. Pick the one that's right for you.
+4. Links: document pending feature completion
+Rudimentary blacklisting, we should also allow only relative URIs. We
+need a doc to explain the stuff.

-5. CSS
-While we can prevent the most flagrant cases from affecting your
-layout (such as absolutely positioned elements), no amount of code is going
-to protect your pages from being attacked by garish colors and plain old
-bad taste.  A neat feature would be the ability to define acceptable colors
-in a document, but that's not likely to be implemented for a while.  In the
-meantime, be sure to make sure that floated elements (permitted, since they
-can be quite useful) can't mess up your layout. Once again, we may want to
-disable this by default to protect lazy developers.
+5. CSS: document pending
+Explain which CSS styles we blocked and why.