mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-12-22 16:31:53 +00:00

Update INSTALL file with better instructions. Translation needs updating.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1439 48356398-32a2-884e-a903-53898d9a118a
Edward Z. Yang 2007-11-05 03:40:32 +00:00
parent 1274cfed49
commit 8cd1806ec8

116
INSTALL

@@ -10,6 +10,7 @@ While the impatient can get going immediately with some of the sample
code at the bottom of this library, it's well worth performing some code at the bottom of this library, it's well worth performing some
basic sanity checks to get the most out of this library. basic sanity checks to get the most out of this library.
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
1. Compatibility 1. Compatibility
@@ -23,6 +24,7 @@ These optional extensions can enhance the capabilities of HTML Purifier:
* iconv : Converts text to and from non-UTF-8 encodings * iconv : Converts text to and from non-UTF-8 encodings
* tidy : Used for pretty-printing HTML * tidy : Used for pretty-printing HTML
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
2. Reconnaissance 2. Reconnaissance
@@ -42,7 +44,7 @@ HTML Purifier can process these doctypes:
...and these character encodings: ...and these character encodings:
* UTF-8 (default) * UTF-8 (default)
* Any encoding iconv supports (but crippled internationalization support) * Any encoding iconv supports (with crippled internationalization support)
These defaults reflect what my choices would be if I were authoring an These defaults reflect what my choices would be if I were authoring an
HTML document, however, what you choose depends on the nature of your HTML document, however, what you choose depends on the nature of your
@@ -59,8 +61,9 @@ the doctype from this identifier at the top of your source code:
If the character encoding declaration is missing, STOP NOW, and If the character encoding declaration is missing, STOP NOW, and
read 'docs/enduser-utf8.html' (web accessible at read 'docs/enduser-utf8.html' (web accessible at
http://htmlpurifier.org/docs/enduser-utf8.html). In fact, even if it is http://htmlpurifier.org/docs/enduser-utf8.html). In fact, even if it is
present, read that anyway: most websites specify character encoding present, read this document anyway, as most websites specify character
incorrectly. encoding incorrectly.
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
3. Including the library 3. Including the library
@@ -70,7 +73,8 @@ The procedure is quite simple:
require_once '/path/to/library/HTMLPurifier.auto.php'; require_once '/path/to/library/HTMLPurifier.auto.php';
I recommend only including HTML Purifier when you need it, because that I recommend only including HTML Purifier when you need it, because that
call represents the inclusion of a lot of PHP files. call represents the inclusion of a lot of PHP files which constitute
the bulk of HTML Purifier's memory usage.
If you don't like your include_path to be fiddled around with, simply set If you don't like your include_path to be fiddled around with, simply set
HTML Purifier's library/ directory to the include path yourself and then: HTML Purifier's library/ directory to the include path yourself and then:
@@ -98,7 +102,6 @@ object and read on:
$config = HTMLPurifier_Config::createDefault(); $config = HTMLPurifier_Config::createDefault();
4.1. Setting a different character encoding 4.1. Setting a different character encoding
You really shouldn't use any other encoding except UTF-8, especially if you You really shouldn't use any other encoding except UTF-8, especially if you
@@ -125,7 +128,6 @@ but please be cognizant of the issues the "solution" creates (for this
reason, I do not include the solution in this document). reason, I do not include the solution in this document).
4.2. Setting a different doctype 4.2. Setting a different doctype
For those of you using HTML 4.01 Transitional, you can disable For those of you using HTML 4.01 Transitional, you can disable
@@ -142,7 +144,6 @@ Other supported doctypes include:
* XHTML 1.1 * XHTML 1.1
4.3. Other settings 4.3. Other settings
There are more configuration directives which can be read about There are more configuration directives which can be read about
@@ -152,55 +153,24 @@ your code. Some of the more interesting ones are configurable at the
demo <http://htmlpurifier.org/demo.php> and are well worth looking into demo <http://htmlpurifier.org/demo.php> and are well worth looking into
for your own system. for your own system.
For example, you can fine tune allowed elements and attributes, convert
relative URLs to absolute ones, and even autoparagraph input text! These
are, respectively, %HTML.Allowed, %URI.MakeAbsolute and %URI.Base, and
%AutoFormat.AutoParagraph. The %Namespace.Directive naming convention
translates to:
--------------------------------------------------------------------------- $config->set('Namespace', 'Directive', $value);
5. Using the code
The interface is mind-numbingly simple: E.g.
$purifier = new HTMLPurifier(); $config->set('HTML', 'Allowed', 'p,b,a[href],i');
$clean_html = $purifier->purify( $dirty_html ); $config->set('URI', 'Base', 'http://www.example.com');
$config->set('URI', 'MakeAbsolute', true);
...or, if you're using the configuration object: $config->set('AutoFormat', 'AutoParagraph', true);
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify( $dirty_html );
That's it! For more examples, check out docs/examples/ (they aren't very
different though). Also, docs/enduser-slow.html gives advice on what to
do if HTML Purifier is slowing down your application.
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
6. Quick install 5. Caching
First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
writable by the webserver (see Section 7: Caching below for details).
If your website is in UTF-8 and XHTML Transitional, use this code:
<?php
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
$purifier = new HTMLPurifier();
$clean_html = $purifier->purify($dirty_html);
?>
If your website is in a different encoding or doctype, use this code:
<?php
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
$config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
$config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify($dirty_html);
?>
---------------------------------------------------------------------------
7. Caching
HTML Purifier generates some cache files (generally one or two) to speed up HTML Purifier generates some cache files (generally one or two) to speed up
its execution. For maximum performance, make sure that its execution. For maximum performance, make sure that
@@ -236,3 +206,49 @@ Or move the cache directory somewhere else (no trailing slash):
$config->set('Cache', 'SerializerPath', '/home/user/absolute/path'); $config->set('Cache', 'SerializerPath', '/home/user/absolute/path');
---------------------------------------------------------------------------
6. Using the code
The interface is mind-numbingly simple:
$purifier = new HTMLPurifier();
$clean_html = $purifier->purify( $dirty_html );
...or, if you're using the configuration object:
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify( $dirty_html );
That's it! For more examples, check out docs/examples/ (they aren't very
different though). Also, docs/enduser-slow.html gives advice on what to
do if HTML Purifier is slowing down your application.
---------------------------------------------------------------------------
7. Quick install
First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
writable by the webserver (see Section 5: Caching above for details).
If your website is in UTF-8 and XHTML Transitional, use this code:
<?php
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
$purifier = new HTMLPurifier();
$clean_html = $purifier->purify($dirty_html);
?>
If your website is in a different encoding or doctype, use this code:
<?php
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
$config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
$config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
$purifier = new HTMLPurifier($config);
$clean_html = $purifier->purify($dirty_html);
?>