mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-11-09 15:28:40 +00:00

Some small doc updates

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1419 48356398-32a2-884e-a903-53898d9a118a
Edward Z. Yang 2007-09-25 02:42:35 +00:00
parent 1f9a6ba30e
commit 5f76796e14
2 changed files with 75 additions and 58 deletions

100
INSTALL

@ -2,62 +2,52 @@
Install Install
How to install HTML Purifier How to install HTML Purifier
HTML Purifier is designed to run out of the box, so actually using the library HTML Purifier is designed to run out of the box, so actually using the
is extremely easy. (Although, if you were looking for a step-by-step library is extremely easy. (Although... if you were looking for a
installation GUI, you've come to the wrong place!) The impatient can scroll step-by-step installation GUI, you've downloaded the wrong software!)
down to the bottom of this INSTALL document to see the code, but you really
should make sure a few things are properly done.
While the impatient can get going immediately with some of the sample
code at the bottom of this document, it's well worth performing some
basic sanity checks to get the most out of this library.
---------------------------------------------------------------------------
1. Compatibility 1. Compatibility
HTML Purifier works in both PHP 4 and PHP 5, from PHP 4.3.2 and up. It has no HTML Purifier works in both PHP 4 and PHP 5, from PHP 4.3.2 and up. It has
core dependencies with other libraries. no core dependencies with other libraries. PHP 4 support will be
deprecated on December 31, 2007, at which time only essential security
fixes will be issued for the PHP 4 version until August 8, 2008.
Optional extensions are iconv (usually installed) and tidy (also common). These optional extensions can enhance the capabilities of HTML Purifier:
If you use UTF-8 and don't plan on pretty-printing HTML, you can get away with
not having either of these extensions.
* iconv : Converts text to and from non-UTF-8 encodings
* tidy : Used for pretty-printing HTML
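
The compatibility notes above can be checked from PHP itself. The following is a minimal sketch, not part of HTML Purifier: check_environment() is a hypothetical helper name, and it relies only on PHP built-ins (version_compare, extension_loaded).

```php
<?php
// Hypothetical helper (not shipped with HTML Purifier): reports whether the
// running PHP meets the 4.3.2 minimum and which optional extensions exist.
function check_environment() {
    return array(
        'php_ok' => version_compare(PHP_VERSION, '4.3.2', '>='),
        'iconv'  => extension_loaded('iconv'), // needed for non-UTF-8 encodings
        'tidy'   => extension_loaded('tidy'),  // needed for pretty-printing
    );
}

$env = check_environment();
foreach ($env as $name => $available) {
    echo $name, ': ', $available ? 'yes' : 'no', "\n";
}
```

If both extensions are reported as missing, the library should still work for UTF-8 input without pretty-printing, as the notes above describe.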
---------------------------------------------------------------------------
2. Reconnaissance
2. Including the library A big plus of HTML Purifier is its inerrant support of standards, so
your web-pages should be standards-compliant. (They should also use
semantic markup, but that's another issue altogether, one HTML Purifier
cannot fix without reading your mind.)
Simply use: HTML Purifier can process these doctypes:
require_once '/path/to/library/HTMLPurifier.auto.php';
...and you're good to go. Since HTML Purifier's codebase is fairly
large, I recommend only including HTML Purifier when you need it.
If you don't like your include_path to be fiddled around with, simply set
HTML Purifier's library/ directory to the include path yourself and then:
require_once 'HTMLPurifier.php';
Only the contents in the library/ folder are necessary, so you can remove
everything else when using HTML Purifier in a production environment.
3. Preparing the proper output environment
HTML Purifier is all about web-standards, so accordingly your webpages should
be standards compliant. HTML Purifier can deal with these doctypes:
* XHTML 1.0 Transitional (default) * XHTML 1.0 Transitional (default)
* XHTML 1.0 Strict * XHTML 1.0 Strict
* HTML 4.01 Transitional * HTML 4.01 Transitional
* HTML 4.01 Strict * HTML 4.01 Strict
* XHTML 1.1 (sans Ruby) * XHTML 1.1
...and these character encodings: ...and these character encodings:
* UTF-8 (default) * UTF-8 (default)
* Any encoding iconv supports (support is crippled for i18n though) * Any encoding iconv supports (but crippled internationalization support)
The defaults are there for a reason: they are best-practice choices that These defaults reflect what my choices would be if I were authoring an
should not be changed lightly. For those of you in the dark, you can determine HTML document; however, what you choose depends on the nature of your
the doctype from this code in your HTML documents: codebase. If you don't know what doctype you are using, you can determine
the doctype from this identifier at the top of your source code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
@ -66,14 +56,32 @@ the doctype from this code in your HTML documents:
<meta http-equiv="Content-type" content="text/html;charset=ENCODING"> <meta http-equiv="Content-type" content="text/html;charset=ENCODING">
For legacy codebases these declarations may be missing. If that is the case, If the character encoding declaration is missing, STOP NOW, and
STOP, and read docs/enduser-utf8.html read 'docs/enduser-utf8.html' (web accessible at
http://htmlpurifier.org/docs/enduser-utf8.html). In fact, even if it is
You may currently be vulnerable to XSS and other security threats, and HTML present, read that anyway: most websites specify character encoding
Purifier won't be able to fix that. incorrectly.
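
For a quick look at what a page actually declares, the two identifiers shown above can be pulled out of the source with regular expressions. This is a rough sketch under stated assumptions: describe_declarations() is a hypothetical helper, not an HTML Purifier function, and it only inspects the markup. It does not see a charset sent in the HTTP Content-Type header, so treat its output as a starting point and still read docs/enduser-utf8.html.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): reports the doctype and
// character encoding that a page declares in its own source.
function describe_declarations($html) {
    $result = array('doctype' => null, 'charset' => null);

    // The public identifier is the quoted string after PUBLIC in <!DOCTYPE ...>.
    if (preg_match('/<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"/i', $html, $m)) {
        $result['doctype'] = $m[1];
    }

    // Find the Content-type <meta> tag, then pull charset=... out of it.
    if (preg_match('/<meta[^>]+http-equiv=["\']?Content-type[^>]*>/i', $html, $m)
        && preg_match('/charset=([A-Za-z0-9._:-]+)/i', $m[0], $c)) {
        $result['charset'] = $c[1];
    }
    return $result;
}

$page = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"' . "\n"
      . '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n"
      . '<meta http-equiv="Content-type" content="text/html;charset=UTF-8">';

$found = describe_declarations($page);
echo 'Doctype: ', $found['doctype'], "\n";
echo 'Charset: ', $found['charset'], "\n";
```

A null value for either key means the corresponding declaration was not found in the markup.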
---------------------------------------------------------------------------
3. Including the library
The procedure is quite simple:
require_once '/path/to/library/HTMLPurifier.auto.php';
I recommend only including HTML Purifier when you need it, because that
call represents the inclusion of a lot of PHP files.
If you don't like your include_path to be fiddled around with, simply set
HTML Purifier's library/ directory to the include path yourself and then:
require_once 'HTMLPurifier.php';
Only the contents in the library/ folder are necessary, so you can remove
everything else when using HTML Purifier in a production environment.
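
Setting the include path yourself might look like the sketch below. The directory is the same placeholder used in the text ('/path/to/library'), and the is_file() guard is an addition made here so the snippet does not abort when the placeholder has not been replaced. set_include_path(), get_include_path() and PATH_SEPARATOR are standard PHP.

```php
<?php
// Placeholder location of HTML Purifier's library/ folder; adjust for your setup.
$library_dir = '/path/to/library';

// Prepend the directory so it is searched before the existing entries.
set_include_path($library_dir . PATH_SEPARATOR . get_include_path());

// With the path in place, the short form resolves without HTMLPurifier.auto.php.
// The guard is only here so this sketch stays harmless with the placeholder path.
if (is_file($library_dir . '/HTMLPurifier.php')) {
    require_once 'HTMLPurifier.php';
}
```

Prepending rather than appending means a stray HTMLPurifier.php elsewhere on the include path will not be picked up first.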
---------------------------------------------------------------------------
4. Configuration 4. Configuration
HTML Purifier is designed to run out-of-the-box, but occasionally HTML HTML Purifier is designed to run out-of-the-box, but occasionally HTML
@ -143,9 +151,9 @@ but they can help out for those of you who like to exert maximum control over
your code. Some of the more interesting ones are configurable at the your code. Some of the more interesting ones are configurable at the
demo <http://htmlpurifier.org/demo.php> and are well worth looking into demo <http://htmlpurifier.org/demo.php> and are well worth looking into
for your own system. for your own system.
---------------------------------------------------------------------------
5. Using the code 5. Using the code
The interface is mind-numbingly simple: The interface is mind-numbingly simple:
@ -163,7 +171,7 @@ different though). Also, docs/enduser-slow.html gives advice on what to
do if HTML Purifier is slowing down your application. do if HTML Purifier is slowing down your application.
---------------------------------------------------------------------------
6. Quick install 6. Quick install
First, make sure library/HTMLPurifier/DefinitionCache/Serializer is First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
@ -191,7 +199,7 @@ If your website is in a different encoding or doctype, use this code:
?> ?>
---------------------------------------------------------------------------
7. Caching 7. Caching
HTML Purifier generates some cache files (generally one or two) to speed up HTML Purifier generates some cache files (generally one or two) to speed up


@ -23,7 +23,7 @@
/* /*
HTML Purifier 2.1.2 - Standards Compliant HTML Filtering HTML Purifier 2.1.2 - Standards Compliant HTML Filtering
Copyright (C) 2006 Edward Z. Yang Copyright (C) 2006-2007 Edward Z. Yang
This library is free software; you can redistribute it and/or This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public modify it under the terms of the GNU Lesser General Public
@ -43,9 +43,8 @@
// constants are slow, but we'll make one exception // constants are slow, but we'll make one exception
define('HTMLPURIFIER_PREFIX', dirname(__FILE__)); define('HTMLPURIFIER_PREFIX', dirname(__FILE__));
// almost every class has an undocumented dependency to these, so make sure // every class has an undocumented dependency to these, must be included!
// they get included require_once 'HTMLPurifier/ConfigSchema.php'; // fatal errors if not included
require_once 'HTMLPurifier/ConfigSchema.php'; // important
require_once 'HTMLPurifier/Config.php'; require_once 'HTMLPurifier/Config.php';
require_once 'HTMLPurifier/Context.php'; require_once 'HTMLPurifier/Context.php';
@ -60,16 +59,23 @@ require_once 'HTMLPurifier/LanguageFactory.php';
HTMLPurifier_ConfigSchema::define( HTMLPurifier_ConfigSchema::define(
'Core', 'CollectErrors', false, 'bool', ' 'Core', 'CollectErrors', false, 'bool', '
Whether or not to collect errors found while filtering the document. This Whether or not to collect errors found while filtering the document. This
is a useful way to give feedback to your users. CURRENTLY NOT IMPLEMENTED. is a useful way to give feedback to your users. <strong>Warning:</strong>
This directive has been available since 2.0.0. Currently this feature is very patchy and experimental, with lots of
possible error messages not yet implemented. It will not cause any problems,
but it may not help your users either. This directive has been available
since 2.0.0.
'); ');
/** /**
* Main library execution class. * Facade that coordinates HTML Purifier's subsystems in order to purify HTML.
* *
* Facade that performs calls to the HTMLPurifier_Lexer, * @note There are several points in which configuration can be specified
* HTMLPurifier_Strategy and HTMLPurifier_Generator subsystems in order to * for HTML Purifier. The precedence of these (from lowest to
* purify HTML. * highest) is as follows:
* -# Instance: new HTMLPurifier($config)
* -# Invocation: purify($html, $config)
* These configurations are entirely independent of each other and
* are *not* merged.
* *
* @todo We need an easier way to inject strategies, it'll probably end * @todo We need an easier way to inject strategies, it'll probably end
* up getting done through config though. * up getting done through config though.
@ -80,12 +86,13 @@ class HTMLPurifier
var $version = '2.1.2'; var $version = '2.1.2';
var $config; var $config;
var $filters; var $filters = array();
var $strategy, $generator; var $strategy, $generator;
/** /**
* Final HTMLPurifier_Context of last run purification. Might be an array. * Resultant HTMLPurifier_Context of last run purification. Is an array
* of contexts if the last called method was purifyArray().
* @public * @public
*/ */
var $context; var $context;
@ -198,6 +205,8 @@ class HTMLPurifier
/** /**
* Singleton for enforcing just one HTML Purifier in your system * Singleton for enforcing just one HTML Purifier in your system
* @param $prototype Optional prototype HTMLPurifier instance to
* overload singleton with.
*/ */
function &getInstance($prototype = null) { function &getInstance($prototype = null) {
static $htmlpurifier; static $htmlpurifier;