Release 2.1.3, merged in 1404 to HEAD.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/strict@1444 48356398-32a2-884e-a903-53898d9a118a
2025-01-23 13:51:54 +00:00 · 2007-11-06 04:34:33 +00:00 · 2007-11-06 04:34:33 +00:00 · 9db861e356
commit 9db861e356
parent b3f0e6c86c
47 changed files with 5526 additions and 278 deletions
--- a/1100
+++ b/1100
--- a/237
+++ b/237
@ -1,34 +1,81 @@
+
 Install
    How to install HTML Purifier

-HTML Purifier is designed to run out of the box,  so actually using the library
-is extremely easy. (Although, if you were looking for a step-by-step
-installation GUI, you've come to the wrong place!)  The impatient can scroll
-down to the bottom of this INSTALL document to see the code, but you really
-should make sure a few things are properly done.
-
-
+HTML Purifier is designed to run out of the box, so actually using the 
+library is extremely easy.  (Although... if you were looking for a 
+step-by-step installation GUI, you've downloaded the wrong software!)
+
+While the impatient can get going immediately with some of the sample
+code at the bottom of this library, it's well worth performing some
+basic sanity checks to get the most out of this library.


+---------------------------------------------------------------------------
 1.  Compatibility

-HTML Purifier works in both PHP 4 and PHP 5, from PHP 4.3.2 and up. It has no
-core dependencies with other libraries.
+HTML Purifier works in both PHP 4 and PHP 5, and is actively tested from 
+PHP 4.3.7 and up (see tests/multitest.php for specific versions). It has 
+no core dependencies with other libraries. PHP 4 support will be 
+deprecated on December 31, 2007, at which time only essential security 
+fixes will be issued for the PHP 4 version until August 8, 2008. 

-Optional extensions are iconv (usually installed) and tidy (also common).
-If you use UTF-8 and don't plan on pretty-printing HTML, you can get away with
-not having either of these extensions.
+These optional extensions can enhance the capabilities of HTML Purifier:
+
+    * iconv : Converts text to and from non-UTF-8 encodings
+    * tidy  : Used for pretty-printing HTML


+---------------------------------------------------------------------------
+2.  Reconnaissance

-2.  Including the library
+A big plus of HTML Purifier is its inerrant support of standards, so
+your web-pages should be standards-compliant.  (They should also use
+semantic markup, but that's another issue altogether, one HTML Purifier
+cannot fix without reading your mind.)

-Simply use:
+HTML Purifier can process these doctypes:
+
+* XHTML 1.0 Transitional (default)
+* XHTML 1.0 Strict
+* HTML 4.01 Transitional
+* HTML 4.01 Strict
+* XHTML 1.1
+
+...and these character encodings:
+
+* UTF-8 (default)
+* Any encoding iconv supports (with crippled internationalization support)
+
+These defaults reflect what my choices where be if I were authoring an
+HTML document, however, what you choose depends on the nature of your
+codebase.  If you don't know what doctype you are using, you can determine
+the doctype from this identifier at the top of your source code:
+
+    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+...and the character encoding from this code:
+
+    <meta http-equiv="Content-type" content="text/html;charset=ENCODING">
+
+If the character encoding declaration is missing, STOP NOW, and
+read 'docs/enduser-utf8.html' (web accessible at
+http://htmlpurifier.org/docs/enduser-utf8.html).  In fact, even if it is
+present, read this document anyway, as most websites specify character
+encoding incorrectly.
+
+
+---------------------------------------------------------------------------
+3.  Including the library
+
+The procedure is quite simple:

    require_once '/path/to/library/HTMLPurifier.auto.php';

-...and you're good to go.  Since HTML Purifier's codebase is fairly
-large, I recommend only including HTML Purifier when you need it.
+I recommend only including HTML Purifier when you need it, because that
+call represents the inclusion of a lot of PHP files which constitute
+the bulk of HTML Purifier's memory usage.

 If you don't like your include_path to be fiddled around with, simply set
 HTML Purifier's library/ directory to the include path yourself and then:
@ -39,46 +86,7 @@ Only the contents in the library/ folder are necessary, so you can remove
 everything else when using HTML Purifier in a production environment. 


-
-3.  Preparing the proper output environment
-
-HTML Purifier is all about web-standards, so accordingly your webpages should
-be standards compliant.  HTML Purifier can deal with these doctypes:
-
-* XHTML 1.0 Transitional (default)
-* XHTML 1.0 Strict
-* HTML 4.01 Transitional
-* HTML 4.01 Strict
-* XHTML 1.1 (sans Ruby)
-
-...and these character encodings:
-
-* UTF-8 (default)
-* Any encoding iconv supports (support is crippled for i18n though)
-
-The defaults are there for a reason: they are best-practice choices that
-should not be changed lightly.  For those of you in the dark, you can determine
-the doctype from this code in your HTML documents:
-
-    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
-        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-
-...and the character encoding from this code:
-
-    <meta http-equiv="Content-type" content="text/html;charset=ENCODING">
-
-For legacy codebases these declarations may be missing.  If that is the case,
-STOP, and read docs/enduser-utf8.html
-
-
-
-
-
-You may currently be vulnerable to XSS and other security threats, and HTML
-Purifier won't be able to fix that.
-
-
-
+---------------------------------------------------------------------------
 4. Configuration

 HTML Purifier is designed to run out-of-the-box, but occasionally HTML
@ -95,7 +103,6 @@ object and read on:
    $config = HTMLPurifier_Config::createDefault();


-
 4.1. Setting a different character encoding

 You really shouldn't use any other encoding except UTF-8, especially if you
@ -122,10 +129,6 @@ but please be cognizant of the issues the "solution" creates (for this
 reason, I do not include the solution in this document).


-
-
-
-
 4.2. Setting a different doctype

 For those of you using HTML 4.01 Transitional, you can disable
@ -135,7 +138,6 @@ XHTML output like this:

 Other supported doctypes include:

-
    * HTML 4.01 Strict
    * HTML 4.01 Transitional
    * XHTML 1.0 Strict
@ -143,7 +145,6 @@ Other supported doctypes include:
    * XHTML 1.1


-
 4.3. Other settings

 There are more configuration directives which can be read about
@ -153,55 +154,24 @@ your code.  Some of the more interesting ones are configurable at the
 demo <http://htmlpurifier.org/demo.php> and are well worth looking into
 for your own system.

+For example, you can fine tune allowed elements and attributes, convert
+relative URLs to absolute ones, and even autoparagraph input text! These
+are, respectively, %HTML.Allowed, %URI.MakeAbsolute and %URI.Base, and
+%AutoFormat.AutoParagraph. The %Namespace.Directive naming convention
+translates to:
+
+    $config->set('Namespace', 'Directive', $value);
+
+E.g.
+
+    $config->set('HTML', 'Allowed', 'p,b,a[href],i');
+    $config->set('URI', 'Base', 'http://www.example.com');
+    $config->set('URI', 'MakeAbsolute', true);
+    $config->set('AutoFormat', 'AutoParagraph', true);


-5.   Using the code
-
-The interface is mind-numbingly simple:
-
-    $purifier = new HTMLPurifier();
-    $clean_html = $purifier->purify( $dirty_html );
-
-...or, if you're using the configuration object:
-
-    $purifier = new HTMLPurifier($config);
-    $clean_html = $purifier->purify( $dirty_html );
-
-That's it!  For more examples, check out docs/examples/ (they aren't very
-different though).  Also, docs/enduser-slow.html gives advice on what to
-do if HTML Purifier is slowing down your application.
-
-
-
-6.   Quick install
-
-First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
-writable by the webserver (see Section 7: Caching below for details).
-If your website is in UTF-8 and XHTML Transitional, use this code:
-
-<?php
-    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
-    
-    $purifier = new HTMLPurifier();
-    $clean_html = $purifier->purify($dirty_html);
-?>
-
-If your website is in a different encoding or doctype, use this code:
-
-<?php
-    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
-    
-    $config = HTMLPurifier_Config::createDefault();
-    $config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
-    $config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
-    $purifier = new HTMLPurifier($config);
-    
-    $clean_html = $purifier->purify($dirty_html);
-?>
-
-
-
-7. Caching
+---------------------------------------------------------------------------
+5. Caching

 HTML Purifier generates some cache files (generally one or two) to speed up
 its execution. For maximum performance, make sure that
@ -236,3 +206,50 @@ hit):
 Or move the cache directory somewhere else (no trailing slash):

    $config->set('Cache', 'SerializerPath', '/home/user/absolute/path');
+
+
+---------------------------------------------------------------------------
+6.   Using the code
+
+The interface is mind-numbingly simple:
+
+    $purifier = new HTMLPurifier();
+    $clean_html = $purifier->purify( $dirty_html );
+
+...or, if you're using the configuration object:
+
+    $purifier = new HTMLPurifier($config);
+    $clean_html = $purifier->purify( $dirty_html );
+
+That's it!  For more examples, check out docs/examples/ (they aren't very
+different though).  Also, docs/enduser-slow.html gives advice on what to
+do if HTML Purifier is slowing down your application.
+
+
+---------------------------------------------------------------------------
+7.   Quick install
+
+First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
+writable by the webserver (see Section 5: Caching above for details).
+If your website is in UTF-8 and XHTML Transitional, use this code:
+
+<?php
+    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
+    
+    $purifier = new HTMLPurifier();
+    $clean_html = $purifier->purify($dirty_html);
+?>
+
+If your website is in a different encoding or doctype, use this code:
+
+<?php
+    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
+    
+    $config = HTMLPurifier_Config::createDefault();
+    $config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
+    $config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
+    $purifier = new HTMLPurifier($config);
+    
+    $clean_html = $purifier->purify($dirty_html);
+?>
+
--- a/49
+++ b/49
@ -9,6 +9,55 @@ NEWS ( CHANGELOG and HISTORY )                                     HTMLPurifier
    . Internal change
 ==========================

+2.1.3, released 2007-11-05
+! tests/multitest.php allows you to test multiple versions by running
+  tests/index.php through multiple interpreters using `phpv` shell
+  script (you must provide this script!)
+- Fixed poor include ordering for Email URI AttrDefs, causes fatal errors
+  on some systems.
+- Injector algorithm further refined: off-by-one error regarding skip 
+  counts for dormant injectors fixed
+- Corrective blockquote definition now enabled for HTML 4.01 Strict
+- Fatal error when <img> tag (or any other element with required attributes)
+  has 'id' attribute fixed, thanks NykO18 for reporting
+- Fix warning emitted when a non-supported URI scheme is passed to the
+  MakeAbsolute URIFilter, thanks NykO18 (again)
+- Further refine AutoParagraph injector. Behavior inside of elements
+  allowing paragraph tags clarified: only inline content delimeted by
+  double newlines (not block elements) are paragraphed.
+- Buggy treatment of end tags of elements that have required attributes
+  fixed (does not manifest on default tag-set)
+- Spurious internal content reorganization error suppressed
+- HTMLDefinition->addElement now returns a reference to the created
+  element object, as implied by the documentation
+- Phorum mod's HTML Purifier help message expanded (unreleased elsewhere)
+- Fix a theoretical class of infinite loops from DirectLex reported
+  by Nate Abele
+- Work around unnecessary DOMElement type-cast in PH5P that caused errors
+  in PHP 5.1
+- Work around PHP 4 SimpleTest lack-of-error complaining for one-time-only
+  HTMLDefinition errors, this may indicate problems with error-collecting
+  facilities in PHP 5
+- Make ErrorCollectorEMock work in both PHP 4 and PHP 5
+- Make PH5P work with PHP 5.0 by removing unnecessary array parameter typedef
+. %Core.AcceptFullDocuments renamed to %Core.ConvertDocumentToFragment 
+  to better communicate its purpose
+. Error unit tests can now specify the expectation of no errors. Future
+  iterations of the harness will be extremely strict about what errors
+  are allowed
+. Extend Injector hooks to allow for more powerful injector routines
+. HTMLDefinition->addBlankElement created, as according to the HTMLModule
+  method
+. Doxygen configuration file updated, with minor improvements
+. Test runner now checks for similarly named files in conf/ directory too.
+. Minor cosmetic change to flush-definition-cache.php: trailing newline is
+  outputted
+. Maintenance script for generating PH5P patch added, original PH5P source
+  file also added under version control
+. Full unit test runner script title made more descriptive with PHP version
+. Updated INSTALL file to state that 4.3.7 is the earliest version we
+  are actively testing
+
 2.1.2, released 2007-09-03
 ! Implemented Object module for trusted users
 ! Implemented experimental HTML5 parsing mode using PH5P. To use, add
--- a/2
+++ b/2
@ -1 +1 @@
-2.1.2
+2.1.3
--- a/14
+++ b/14
@ -1,8 +1,6 @@
-Version 2.1.2 is a mix of experimental features and stability updates.
-Among new features: an Object module for trusted users, support for the
-CSS property 'border-spacing', and HTML 5 style parsing using PH5P.
-Bug fixes ihave resolved a few obscure issues including border-collapse:seperate,
-a DirectLex parsing error, broken HTML in printDefinition.php, and problems
-with the experimental standalone distribution. Also, there were large
-amounts of behind-the-scenes refactoring and the removal of URIScheme
-inclusion reflection.
+Stability release 2.1.3 fixes a slew of minor bugs found in HTML Purifier,
+and also includes some internal code enhancements and refactorings.
+Notably, tests/multitest.php automates testing in multiple versions, 
+fatal AttrDef_URI_Email error fixed, blockquote contents are more lenient
+in HTML 4.01 Strict and fatal errors involving ID tags in img tags were
+fixed.
--- a/library/HTMLPurifier.php
+++ b/library/HTMLPurifier.php
@ -22,8 +22,8 @@
 */

 /*
-    HTML Purifier 2.1.2 - Standards Compliant HTML Filtering
-    Copyright (C) 2006 Edward Z. Yang
+    HTML Purifier 2.1.3 - Standards Compliant HTML Filtering
+    Copyright (C) 2006-2007 Edward Z. Yang

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@ -43,9 +43,8 @@
 // constants are slow, but we'll make one exception
 define('HTMLPURIFIER_PREFIX', dirname(__FILE__));

-// almost every class has an undocumented dependency to these, so make sure
-// they get included
-require_once 'HTMLPurifier/ConfigSchema.php'; // important
+// every class has an undocumented dependency to these, must be included!
+require_once 'HTMLPurifier/ConfigSchema.php'; // fatal errors if not included
 require_once 'HTMLPurifier/Config.php';
 require_once 'HTMLPurifier/Context.php';

@ -60,16 +59,23 @@ require_once 'HTMLPurifier/LanguageFactory.php';
 HTMLPurifier_ConfigSchema::define(
    'Core', 'CollectErrors', false, 'bool', '
 Whether or not to collect errors found while filtering the document. This
-is a useful way to give feedback to your users. CURRENTLY NOT IMPLEMENTED.
-This directive has been available since 2.0.0.
+is a useful way to give feedback to your users. <strong>Warning:</strong>
+Currently this feature is very patchy and experimental, with lots of
+possible error messages not yet implemented. It will not cause any problems,
+but it may not help your users either. This directive has been available
+since 2.0.0.
 ');

 /**
- * Main library execution class.
+ * Facade that coordinates HTML Purifier's subsystems in order to purify HTML.
 * 
- * Facade that performs calls to the HTMLPurifier_Lexer,
- * HTMLPurifier_Strategy and HTMLPurifier_Generator subsystems in order to
- * purify HTML.
+ * @note There are several points in which configuration can be specified 
+ *       for HTML Purifier.  The precedence of these (from lowest to
+ *       highest) is as follows:
+ *          -# Instance: new HTMLPurifier($config)
+ *          -# Invocation: purify($html, $config)
+ *       These configurations are entirely independent of each other and
+ *       are *not* merged.
 * 
 * @todo We need an easier way to inject strategies, it'll probably end
 *       up getting done through config though.
@ -77,15 +83,16 @@ This directive has been available since 2.0.0.
 class HTMLPurifier
 {
    
-    var $version = '2.1.2';
+    var $version = '2.1.3';
    
    var $config;
-    var $filters;
+    var $filters = array();
    
    var $strategy, $generator;
    
    /**
-     * Final HTMLPurifier_Context of last run purification. Might be an array.
+     * Resultant HTMLPurifier_Context of last run purification. Is an array
+     * of contexts if the last called method was purifyArray().
     * @public
     */
    var $context;
@ -150,6 +157,11 @@ class HTMLPurifier
            $context->register('ErrorCollector', $error_collector);
        }
        
+        // setup id_accumulator context, necessary due to the fact that
+        // AttrValidator can be called from many places
+        $id_accumulator = HTMLPurifier_IDAccumulator::build($config, $context);
+        $context->register('IDAccumulator', $id_accumulator);
+        
        $html = HTMLPurifier_Encoder::convertToUTF8($html, $config, $context);
        
        for ($i = 0, $size = count($this->filters); $i < $size; $i++) {
@ -198,6 +210,8 @@ class HTMLPurifier
    
    /**
     * Singleton for enforcing just one HTML Purifier in your system
+     * @param $prototype Optional prototype HTMLPurifier instance to
+     *                   overload singleton with.
     */
    static function &getInstance($prototype = null) {
        static $htmlpurifier;
--- a/library/HTMLPurifier/AttrDef/URI.php
+++ b/library/HTMLPurifier/AttrDef/URI.php
@ -102,7 +102,7 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
            $result = $uri->validate($config, $context);
            if (!$result) break;
            
-            // chained validation
+            // chained filtering
            $uri_def =& $config->getDefinition('URI');
            $result = $uri_def->filter($uri, $config, $context);
            if (!$result) break;
--- a/library/HTMLPurifier/AttrDef/URI/Email.php
+++ b/library/HTMLPurifier/AttrDef/URI/Email.php
@ -1,7 +1,6 @@
 <?php

 require_once 'HTMLPurifier/AttrDef.php';
-require_once 'HTMLPurifier/AttrDef/URI/Email/SimpleCheck.php';

 class HTMLPurifier_AttrDef_URI_Email extends HTMLPurifier_AttrDef
 {
@ -15,3 +14,5 @@ class HTMLPurifier_AttrDef_URI_Email extends HTMLPurifier_AttrDef
    
 }

+// sub-implementations
+require_once 'HTMLPurifier/AttrDef/URI/Email/SimpleCheck.php';
--- a/library/HTMLPurifier/AttrValidator.php
+++ b/library/HTMLPurifier/AttrValidator.php
@ -23,6 +23,13 @@ class HTMLPurifier_AttrValidator
        $definition = $config->getHTMLDefinition();
        $e =& $context->get('ErrorCollector', true);
        
+        // initialize IDAccumulator if necessary
+        $ok =& $context->get('IDAccumulator', true);
+        if (!$ok) {
+            $id_accumulator = HTMLPurifier_IDAccumulator::build($config, $context);
+            $context->register('IDAccumulator', $id_accumulator);
+        }
+        
        // initialize CurrentToken if necessary
        $current_token =& $context->get('CurrentToken', true);
        if (!$current_token) $context->register('CurrentToken', $token);
--- a/library/HTMLPurifier/ChildDef/Optional.php
+++ b/library/HTMLPurifier/ChildDef/Optional.php
@ -15,7 +15,10 @@ class HTMLPurifier_ChildDef_Optional extends HTMLPurifier_ChildDef_Required
    var $type = 'optional';
    function validateChildren($tokens_of_children, $config, &$context) {
        $result = parent::validateChildren($tokens_of_children, $config, $context);
-        if ($result === false) return array();
+        if ($result === false) {
+            if (empty($tokens_of_children)) return true;
+            else return array();
+        }
        return $result;
    }
 }
--- a/library/HTMLPurifier/Config.php
+++ b/library/HTMLPurifier/Config.php
@ -42,7 +42,7 @@ class HTMLPurifier_Config
    /**
     * HTML Purifier's version
     */
-    var $version = '2.1.2';
+    var $version = '2.1.3';
    
    /**
     * Two-level associative array of configuration directives
--- a/library/HTMLPurifier/HTMLDefinition.php
+++ b/library/HTMLPurifier/HTMLDefinition.php
@ -236,13 +236,26 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
    /**
     * Adds a custom element to your HTML definition
     * @note See HTMLPurifier_HTMLModule::addElement for detailed 
-     *       parameter descriptions.
+     *       parameter and return value descriptions.
     */
-    function addElement($element_name, $type, $contents, $attr_collections, $attributes) {
+    function &addElement($element_name, $type, $contents, $attr_collections, $attributes) {
        $module =& $this->getAnonymousModule();
        // assume that if the user is calling this, the element
        // is safe. This may not be a good idea
-        $module->addElement($element_name, true, $type, $contents, $attr_collections, $attributes);
+        $element =& $module->addElement($element_name, true, $type, $contents, $attr_collections, $attributes);
+        return $element;
+    }
+    
+    /**
+     * Adds a blank element to your HTML definition, for overriding
+     * existing behavior
+     * @note See HTMLPurifier_HTMLModule::addBlankElement for detailed
+     *       parameter and return value descriptions.
+     */
+    function &addBlankElement($element_name) {
+        $module  =& $this->getAnonymousModule();
+        $element =& $module->addBlankElement($element_name);
+        return $element;
    }
    
    /**
--- a/library/HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php
+++ b/library/HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php
@ -13,6 +13,8 @@ require_once 'HTMLPurifier/AttrTransform/Length.php';
 require_once 'HTMLPurifier/AttrTransform/ImgSpace.php';
 require_once 'HTMLPurifier/AttrTransform/EnumToCSS.php';

+require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
+
 class HTMLPurifier_HTMLModule_Tidy_XHTMLAndHTML4 extends
      HTMLPurifier_HTMLModule_Tidy
 {
@ -188,5 +190,17 @@ class HTMLPurifier_HTMLModule_Tidy_Strict extends
 {
    var $name = 'Tidy_Strict';
    var $defaultLevel = 'light';
+    
+    function makeFixes() {
+        $r = parent::makeFixes();
+        $r['blockquote#content_model_type'] = 'strictblockquote';
+        return $r;
+    }
+    
+    var $defines_child_def = true;
+    function getChildDef($def) {
+        if ($def->content_model_type != 'strictblockquote') return parent::getChildDef($def);
+        return new HTMLPurifier_ChildDef_StrictBlockquote($def->content_model);
+    }
 }

--- a/library/HTMLPurifier/HTMLModule/Tidy/XHTMLStrict.php
+++ b/library/HTMLPurifier/HTMLModule/Tidy/XHTMLStrict.php
@ -1,26 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/HTMLModule/Tidy.php';
-require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
-
-class HTMLPurifier_HTMLModule_Tidy_XHTMLStrict extends
-      HTMLPurifier_HTMLModule_Tidy
-{
-    
-    var $name = 'Tidy_XHTMLStrict';
-    var $defaultLevel = 'light';
-    
-    function makeFixes() {
-        $r = array();
-        $r['blockquote#content_model_type'] = 'strictblockquote';
-        return $r;
-    }
-    
-    var $defines_child_def = true;
-    function getChildDef($def) {
-        if ($def->content_model_type != 'strictblockquote') return false;
-        return new HTMLPurifier_ChildDef_StrictBlockquote($def->content_model);
-    }
-    
-}
-
--- a/library/HTMLPurifier/HTMLModuleManager.php
+++ b/library/HTMLPurifier/HTMLModuleManager.php
@ -35,7 +35,6 @@ require_once 'HTMLPurifier/HTMLModule/Object.php';
 require_once 'HTMLPurifier/HTMLModule/Tidy.php';
 require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php';
 require_once 'HTMLPurifier/HTMLModule/Tidy/XHTML.php';
-require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLStrict.php';
 require_once 'HTMLPurifier/HTMLModule/Tidy/Proprietary.php';

 HTMLPurifier_ConfigSchema::define(
@ -209,7 +208,7 @@ class HTMLPurifier_HTMLModuleManager
        $this->doctypes->register(
            'XHTML 1.0 Strict', true,
            array_merge($common, $xml, $non_xml),
-            array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_XHTMLStrict', 'Tidy_Proprietary'),
+            array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Strict', 'Tidy_Proprietary'),
            array(),
            '-//W3C//DTD XHTML 1.0 Strict//EN',
            'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
@ -218,7 +217,7 @@ class HTMLPurifier_HTMLModuleManager
        $this->doctypes->register(
            'XHTML 1.1', true,
            array_merge($common, $xml, array('Ruby')),
-            array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Proprietary', 'Tidy_XHTMLStrict'), // Tidy_XHTML1_1
+            array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Proprietary', 'Tidy_Strict'), // Tidy_XHTML1_1
            array(),
            '-//W3C//DTD XHTML 1.1//EN',
            'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'
--- a/library/HTMLPurifier/IDAccumulator.php
+++ b/library/HTMLPurifier/IDAccumulator.php
@ -1,11 +1,15 @@
 <?php

+HTMLPurifier_ConfigSchema::define(
+    'Attr', 'IDBlacklist', array(), 'list',
+    'Array of IDs not allowed in the document.'
+);
+
 /**
 * Component of HTMLPurifier_AttrContext that accumulates IDs to prevent dupes
 * @note In Slashdot-speak, dupe means duplicate.
- * @note This class does not accept $config or $context, thus, it is the
- *       burden of the callee to register the appropriate errors or
- *       configuration.
+ * @note The default constructor does not accept $config or $context objects:
+ *       use must use the static build() factory method to perform initialization.
 */
 class HTMLPurifier_IDAccumulator
 {
@ -16,6 +20,19 @@ class HTMLPurifier_IDAccumulator
     */
    var $ids = array();
    
+    /**
+     * Builds an IDAccumulator, also initializing the default blacklist
+     * @param $config Instance of HTMLPurifier_Config
+     * @param $context Instance of HTMLPurifier_Context
+     * @return Fully initialized HTMLPurifier_IDAccumulator
+     * @static
+     */
+    static function build($config, &$context) {
+        $id_accumulator = new HTMLPurifier_IDAccumulator();
+        $id_accumulator->load($config->get('Attr', 'IDBlacklist'));
+        return $id_accumulator;
+    }
+    
    /**
     * Add an ID to the lookup table.
     * @param $id ID to be added.
--- a/library/HTMLPurifier/Injector.php
+++ b/library/HTMLPurifier/Injector.php
@ -4,6 +4,9 @@
 * Injects tokens into the document while parsing for well-formedness.
 * This enables "formatter-like" functionality such as auto-paragraphing,
 * smiley-ification and linkification to take place.
+ * 
+ * @todo Allow injectors to request a re-run on their output. This 
+ *       would help if an operation is recursive.
 */
 class HTMLPurifier_Injector
 {
@ -107,5 +110,12 @@ class HTMLPurifier_Injector
     */
    function handleElement(&$token) {}
    
+    /**
+     * Notifier that is called when an end token is processed
+     * @note This differs from handlers in that the token is read-only
+     */
+    function notifyEnd($token) {}
+    
+    
 }

--- a/library/HTMLPurifier/Injector/AutoParagraph.php
+++ b/library/HTMLPurifier/Injector/AutoParagraph.php
@ -6,20 +6,28 @@ HTMLPurifier_ConfigSchema::define(
    'AutoFormat', 'AutoParagraph', false, 'bool', '
 <p>
  This directive turns on auto-paragraphing, where double newlines are
-  converted in to paragraphs whenever possible. Auto-paragraphing
-  applies when:
+  converted in to paragraphs whenever possible. Auto-paragraphing:
 </p>
 <ul>
-  <li>There are inline elements or text in the root node</li>
-  <li>There are inline elements or text with double newlines or
-      block elements in nodes that allow paragraph tags</li>
-  <li>There are double newlines in paragraph tags</li>
+  <li>Always applies to inline elements or text in the root node,</li>
+  <li>Applies to inline elements or text with double newlines in nodes
+      that allow paragraph tags,</li>
+  <li>Applies to double newlines in paragraph tags</li>
 </ul>
 <p>
  <code>p</code> tags must be allowed for this directive to take effect.
  We do not use <code>br</code> tags for paragraphing, as that is
  semantically incorrect.
 </p>
+<p>
+  To prevent auto-paragraphing as a content-producer, refrain from using
+  double-newlines except to specify a new paragraph or in contexts where
+  it has special meaning (whitespace usually has no meaning except in
+  tags like <code>pre</code>, so this should not be difficult.) To prevent
+  the paragraphing of inline text adjacent to block elements, wrap them
+  in <code>div</code> tags (the behavior is slightly different outside of
+  the root node.)
+</p>
 <p>
  This directive has been available since 2.0.1.
 </p>
@ -62,19 +70,27 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
                $ok = false;
                // test if up-coming tokens are either block or have
                // a double newline in them
+                $nesting = 0;
                for ($i = $this->inputIndex + 1; isset($this->inputTokens[$i]); $i++) {
                    if ($this->inputTokens[$i]->type == 'start'){
                        if (!$this->_isInline($this->inputTokens[$i])) {
-                            $ok = true;
+                            // we haven't found a double-newline, and
+                            // we've hit a block element, so don't paragraph
+                            $ok = false;
+                            break;
                        }
-                        break;
+                        $nesting++;
+                    }
+                    if ($this->inputTokens[$i]->type == 'end') {
+                        if ($nesting <= 0) break;
+                        $nesting--;
                    }
-                    if ($this->inputTokens[$i]->type == 'end') break;
                    if ($this->inputTokens[$i]->type == 'text') {
+                        // found it!
                        if (strpos($this->inputTokens[$i]->data, "\n\n") !== false) {
                            $ok = true;
+                            break;
                        }
-                        if (!$this->inputTokens[$i]->is_whitespace) break;
                    }
                }
                if ($ok) {
--- a/library/HTMLPurifier/Lexer.php
+++ b/library/HTMLPurifier/Lexer.php
@ -13,11 +13,14 @@ if (version_compare(PHP_VERSION, "5", ">=")) {
 }

 HTMLPurifier_ConfigSchema::define(
-    'Core', 'AcceptFullDocuments', true, 'bool',
-    'This parameter determines whether or not the filter should accept full '.
-    'HTML documents, not just HTML fragments.  When on, it will '.
-    'drop all sections except the content between body.'
-);
+    'Core', 'ConvertDocumentToFragment', true, 'bool', '
+This parameter determines whether or not the filter should convert
+input that is a full document with html and body tags to a fragment
+of just the contents of a body tag. This parameter is simply something
+HTML Purifier can do during an edge-case: for most inputs, this
+processing is not necessary.
+');
+HTMLPurifier_ConfigSchema::defineAlias('Core', 'AcceptFullDocuments', 'Core', 'ConvertDocumentToFragment');

 HTMLPurifier_ConfigSchema::define(
    'Core', 'LexerImpl', null, 'mixed/null', '
@ -316,7 +319,7 @@ class HTMLPurifier_Lexer
    function normalize($html, $config, &$context) {
        
        // extract body from document if applicable
-        if ($config->get('Core', 'AcceptFullDocuments')) {
+        if ($config->get('Core', 'ConvertDocumentToFragment')) {
            $html = $this->extractBody($html);
        }
        
--- a/library/HTMLPurifier/Lexer/DirectLex.php
+++ b/library/HTMLPurifier/Lexer/DirectLex.php
@ -160,9 +160,15 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
                
                $segment = substr($html, $cursor, $strlen_segment);
                
+                if ($segment === false) {
+                    // somehow, we attempted to access beyond the end of
+                    // the string, defense-in-depth, reported by Nate Abele
+                    break;
+                }
+                
                // Check if it's a comment
                if (
-                    substr($segment, 0, 3) == '!--'
+                    substr($segment, 0, 3) === '!--'
                ) {
                    // re-determine segment length, looking for -->
                    $position_comment_end = strpos($html, '-->', $cursor);
@ -237,7 +243,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
                // trailing slash. Remember, we could have a tag like <br>, so
                // any later token processing scripts must convert improperly
                // classified EmptyTags from StartTags.
-                $is_self_closing= (strrpos($segment,'/') === $strlen_segment-1);
+                $is_self_closing = (strrpos($segment,'/') === $strlen_segment-1);
                if ($is_self_closing) {
                    $strlen_segment--;
                    $segment = substr($segment, 0, $strlen_segment);
--- a/library/HTMLPurifier/Lexer/PH5P.php
+++ b/library/HTMLPurifier/Lexer/PH5P.php
@ -26,8 +26,6 @@ class HTMLPurifier_Lexer_PH5P extends HTMLPurifier_Lexer_DOMLex {
    
 }

-// begin PHP5P source code here
-
 /*

 Copyright 2007 Jeroen van der Meer <http://jero.net/> 
@ -3722,7 +3720,7 @@ class HTML5TreeConstructer {
        }
    }

-    private function generateImpliedEndTags(array $exclude = array()) {
+    private function generateImpliedEndTags($exclude = array()) {
        /* When the steps below require the UA to generate implied end tags,
        then, if the current node is a dd element, a dt element, an li element,
        a p element, a td element, a th  element, or a tr element, the UA must
@ -3736,7 +3734,8 @@ class HTML5TreeConstructer {
        }
    }

-    private function getElementCategory($name) {
+    private function getElementCategory($node) {
+        $name = $node->tagName;
        if(in_array($name, $this->special))
            return self::SPECIAL;

@ -3884,3 +3883,4 @@ class HTML5TreeConstructer {
        return $this->dom;
    }
 }
+?>
--- a/library/HTMLPurifier/Strategy/FixNesting.php
+++ b/library/HTMLPurifier/Strategy/FixNesting.php
@ -195,7 +195,7 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            //################################################################//
            // Process result by interpreting $result
            
-            if ($result === true) {
+            if ($result === true || $child_tokens === $result) {
                // leave the node as is
                
                // register start token as a parental node start
--- a/library/HTMLPurifier/Strategy/MakeWellFormed.php
+++ b/library/HTMLPurifier/Strategy/MakeWellFormed.php
@ -36,28 +36,23 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
        
        $definition = $config->getHTMLDefinition();
        
-        // CurrentNesting
-        $this->currentNesting = array();
-        $context->register('CurrentNesting', $this->currentNesting);
-        
-        // InputIndex
-        $this->inputIndex = false;
-        $context->register('InputIndex', $this->inputIndex);
-        
-        // InputTokens
-        $context->register('InputTokens', $tokens);
-        $this->inputTokens =& $tokens;
-        
-        // OutputTokens
+        // local variables
        $result = array();
-        $this->outputTokens =& $result;
-        
-        // %Core.EscapeInvalidTags
-        $escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
        $generator = new HTMLPurifier_Generator();
-        
+        $escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
        $e =& $context->get('ErrorCollector', true);
        
+        // member variables
+        $this->currentNesting = array();
+        $this->inputIndex     = false;
+        $this->inputTokens    =& $tokens;
+        $this->outputTokens   =& $result;
+        
+        // context variables
+        $context->register('CurrentNesting', $this->currentNesting);
+        $context->register('InputIndex', $this->inputIndex);
+        $context->register('InputTokens', $tokens);
+        
        // -- begin INJECTOR --
        
        $this->injectors = array();
@ -95,6 +90,10 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            trigger_error("Cannot enable $name injector because $error is not allowed", E_USER_WARNING);
        }
        
+        // warning: most foreach loops follow the convention $i => $x.
+        // be sure, for PHP4 compatibility, to only perform write operations
+        // directly referencing the object using $i: $x is only safe for reads
+        
        // -- end INJECTOR --
        
        $token = false;
@ -105,6 +104,8 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            // if all goes well, this token will be passed through unharmed
            $token = $tokens[$this->inputIndex];
            
+            //printTokens($tokens, $this->inputIndex);
+            
            foreach ($this->injectors as $i => $x) {
                if ($x->skip > 0) $this->injectors[$i]->skip--;
            }
@ -114,7 +115,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
                if ($token->type === 'text') {
                     // injector handler code; duplicated for performance reasons
                     foreach ($this->injectors as $i => $x) {
-                         if (!$x->skip) $x->handleText($token);
+                         if (!$x->skip) $this->injectors[$i]->handleText($token);
                         if (is_array($token)) {
                             $this->currentInjector = $i;
                             break;
@ -172,7 +173,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            // injector handler code; duplicated for performance reasons
            if ($ok) {
                foreach ($this->injectors as $i => $x) {
-                    if (!$x->skip) $x->handleElement($token);
+                    if (!$x->skip) $this->injectors[$i]->handleElement($token);
                    if (is_array($token)) {
                        $this->currentInjector = $i;
                        break;
@ -202,6 +203,9 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            $current_parent = array_pop($this->currentNesting);
            if ($current_parent->name == $token->name) {
                $result[] = $token;
+                foreach ($this->injectors as $i => $x) {
+                    $this->injectors[$i]->notifyEnd($token);
+                }
                continue;
            }
            
@ -238,16 +242,16 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            
            // okay, we found it, close all the skipped tags
            // note that skipped tags contains the element we need closed
-            $size = count($skipped_tags);
-            for ($i = $size - 1; $i > 0; $i--) {
-                if ($e && !isset($skipped_tags[$i]->armor['MakeWellFormed_TagClosedError'])) {
+            for ($i = count($skipped_tags) - 1; $i >= 0; $i--) {
+                if ($i && $e && !isset($skipped_tags[$i]->armor['MakeWellFormed_TagClosedError'])) {
                    $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by element end', $skipped_tags[$i]);
                }
-                $result[] = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
+                $result[] = $new_token = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
+                foreach ($this->injectors as $j => $x) { // $j, not $i!!!
+                    $this->injectors[$j]->notifyEnd($new_token);
+                }
            }
            
-            $result[] = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
-            
        }
        
        $context->destroy('CurrentNesting');
@ -255,17 +259,18 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
        $context->destroy('InputIndex');
        $context->destroy('CurrentToken');
        
-        // we're at the end now, fix all still unclosed tags
-        // not using processToken() because at this point we don't
-        // care about current nesting
+        // we're at the end now, fix all still unclosed tags (this is
+        // duplicated from the end of the loop with some slight modifications)
+        // not using $skipped_tags since it would invariably be all of them
        if (!empty($this->currentNesting)) {
-            $size = count($this->currentNesting);
-            for ($i = $size - 1; $i >= 0; $i--) {
+            for ($i = count($this->currentNesting) - 1; $i >= 0; $i--) {
                if ($e && !isset($this->currentNesting[$i]->armor['MakeWellFormed_TagClosedError'])) {
                    $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by document end', $this->currentNesting[$i]);
                }
-                $result[] =
-                    new HTMLPurifier_Token_End($this->currentNesting[$i]->name);
+                $result[] = $new_token = new HTMLPurifier_Token_End($this->currentNesting[$i]->name);
+                foreach ($this->injectors as $j => $x) { // $j, not $i!!!
+                    $this->injectors[$j]->notifyEnd($new_token);
+                }
            }
        }
        
@ -286,8 +291,14 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            
            // adjust the injector skips based on the array substitution
            if ($this->injectors) {
-                $offset = count($token) + 1;
+                $offset = count($token);
                for ($i = 0; $i <= $this->currentInjector; $i++) {
+                    // because of the skip back, we need to add one more
+                    // for uninitialized injectors. I'm not exactly
+                    // sure why this is the case, but I think it has to
+                    // do with the fact that we're decrementing skips
+                    // before re-checking text
+                    if (!$this->injectors[$i]->skip) $this->injectors[$i]->skip++;
                    $this->injectors[$i]->skip += $offset;
                }
            }
--- a/library/HTMLPurifier/Strategy/RemoveForeignElements.php
+++ b/library/HTMLPurifier/Strategy/RemoveForeignElements.php
@ -116,6 +116,7 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
                    // mostly everything's good, but
                    // we need to make sure required attributes are in order
                    if (
+                        ($token->type === 'start' || $token->type === 'empty') &&
                        $definition->info[$token->name]->required_attr &&
                        ($token->name != 'img' || $remove_invalid_img) // ensure config option still works
                    ) {
@ -134,7 +135,6 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
                        $token->armor['ValidateAttributes'] = true;
                    }
                    
-                    // CAN BE GENERICIZED
                    if (isset($hidden_elements[$token->name]) && $token->type == 'start') {
                        $textify_comments = $token->name;
                    } elseif ($token->name === $textify_comments && $token->type == 'end') {
--- a/library/HTMLPurifier/Strategy/ValidateAttributes.php
+++ b/library/HTMLPurifier/Strategy/ValidateAttributes.php
@ -6,10 +6,6 @@ require_once 'HTMLPurifier/IDAccumulator.php';

 require_once 'HTMLPurifier/AttrValidator.php';

-HTMLPurifier_ConfigSchema::define(
-    'Attr', 'IDBlacklist', array(), 'list',
-    'Array of IDs not allowed in the document.');
-
 /**
 * Validate all attributes in the tokens.
 */
@ -19,11 +15,6 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
    
    function execute($tokens, $config, &$context) {
        
-        // setup id_accumulator context
-        $id_accumulator = new HTMLPurifier_IDAccumulator();
-        $id_accumulator->load($config->get('Attr', 'IDBlacklist'));
-        $context->register('IDAccumulator', $id_accumulator);
-        
        // setup validator
        $validator = new HTMLPurifier_AttrValidator();
        
@ -44,8 +35,6 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
            
            $tokens[$key] = $token; // for PHP 4
        }
-        
-        $context->destroy('IDAccumulator');
        $context->destroy('CurrentToken');
        
        return $tokens;
--- a/library/HTMLPurifier/URIFilter.php
+++ b/library/HTMLPurifier/URIFilter.php
@ -1,10 +1,22 @@
 <?php

 /**
- * Chainable filters for custom URI processing 
+ * Chainable filters for custom URI processing.
+ * 
+ * These filters can perform custom actions on a URI filter object,
+ * including transformation or blacklisting.
+ * 
+ * @warning This filter is called before scheme object validation occurs.
+ *          Make sure, if you require a specific scheme object, you
+ *          you check that it exists. This allows filters to convert
+ *          proprietary URI schemes into regular ones.
 */
 class HTMLPurifier_URIFilter
 {
+    
+    /**
+     * Unique identifier of filter
+     */
    var $name;
    
    /**
@ -17,8 +29,12 @@ class HTMLPurifier_URIFilter
     * @param &$uri Reference to URI object
     * @param $config Instance of HTMLPurifier_Config
     * @param &$context Instance of HTMLPurifier_Context
+     * @return bool Whether or not to continue processing: false indicates
+     *         URL is no good, true indicates continue processing. Note that
+     *         all changes are committed directly on the URI object
     */
    function filter(&$uri, $config, &$context) {
        trigger_error('Cannot call abstract function', E_USER_ERROR);
    }
+    
 }
--- a/library/HTMLPurifier/URIFilter/MakeAbsolute.php
+++ b/library/HTMLPurifier/URIFilter/MakeAbsolute.php
@ -47,6 +47,10 @@ class HTMLPurifier_URIFilter_MakeAbsolute extends HTMLPurifier_URIFilter
            // absolute URI already: don't change
            if (!is_null($uri->host)) return true;
            $scheme_obj = $uri->getSchemeObj($config, $context);
+            if (!$scheme_obj) {
+                // scheme not recognized
+                return false;
+            }
            if (!$scheme_obj->hierarchical) {
                // non-hierarchal URI with explicit scheme, don't change
                return true;
--- a/maintenance/PH5P.patch
+++ b/maintenance/PH5P.patch
@ -1,5 +1,5 @@
--- old.php	2007-08-19 14:42:33.640625000 -0400
-+++ new.php	2007-08-19 14:41:51.609375000 -0400
+--- C:\Users\Edward\Webs\htmlpurifier\maintenance\PH5P.php	2007-11-04 23:41:49.074543700 -0500
+++ C:\Users\Edward\Webs\htmlpurifier\maintenance/PH5P.new.php	2007-11-05 00:23:52.839543700 -0500
@@ -211,7 +211,10 @@
         // If nothing is returned, emit a U+0026 AMPERSAND character token.
         // Otherwise, emit the character token that was returned.
@ -43,3 +43,22 @@
                         $entity = $id;
                         break;
                     }
+@@ -3659,7 +3668,7 @@
+         }
+     }
+ 
+-    private function generateImpliedEndTags(array $exclude = array()) {
+    private function generateImpliedEndTags($exclude = array()) {
+         /* When the steps below require the UA to generate implied end tags,
+         then, if the current node is a dd element, a dt element, an li element,
+         a p element, a td element, a th  element, or a tr element, the UA must
+@@ -3673,7 +3682,8 @@
+         }
+     }
+ 
+-    private function getElementCategory($name) {
+    private function getElementCategory($node) {
+        $name = $node->tagName;
+         if(in_array($name, $this->special))
+             return self::SPECIAL;
+ 
--- a/maintenance/PH5P.php
+++ b/maintenance/PH5P.php
--- a/maintenance/flush-definition-cache.php
+++ b/maintenance/flush-definition-cache.php
@ -32,5 +32,5 @@ foreach ($names as $name) {
    $cache->flush($config);
 }

-echo 'Cache flushed successfully.';
+echo "Cache flushed successfully.\n";

--- a/maintenance/generate-ph5p-patch.php
+++ b/maintenance/generate-ph5p-patch.php
@ -0,0 +1,13 @@
+<?php
+
+$orig = realpath(dirname(__FILE__) . '/PH5P.php');
+$new  = realpath(dirname(__FILE__) . '/../library/HTMLPurifier/Lexer/PH5P.php');
+$newt = dirname(__FILE__) . '/PH5P.new.php'; // temporary file
+
+// minor text-processing of new file to get into same format as original
+$new_src = file_get_contents($new);
+$new_src = '<?php' . PHP_EOL . substr($new_src, strpos($new_src, 'class HTML5 {'));
+
+file_put_contents($newt, $new_src);
+shell_exec("diff -u \"$orig\" \"$newt\" > PH5P.patch");
+unlink($newt);
--- a/plugins/phorum/htmlpurifier.php
+++ b/plugins/phorum/htmlpurifier.php
@ -261,12 +261,42 @@ function phorum_htmlpurifier_editor_after_subject() {
    // don't show this message if it's a WYSIWYG editor, since it will
    // then be handled automatically
    if (!empty($GLOBALS['PHORUM']['mod_htmlpurifier']['wysiwyg'])) return;
-    ?><tr><td colspan="2" style="padding:1em 0.3em;">
-  HTML input is <strong>on</strong>. Make sure you escape all HTML and
-  angled-brackets with &amp;lt; and &amp;gt; (you can also use CDATA
-  tags, simply wrap the suspect text with
-&lt;![CDATA[<em>text</em>]]&gt;. Paragraphs will only be applied to 
-double-spaces; single-spaces will not generate <tt>&lt;br&gt;</tt> tags.
+    ?><tr><td colspan="2" style="padding:1em 0.3em;" class="htmlpurifier-help">
+    <p>
+        <strong>HTML input</strong> is enabled. Make sure you escape all HTML and
+        angled brackets with <code>&amp;lt;</code> and <code>&amp;gt;</code>.
+    </p><?php
+            $purifier =& HTMLPurifier::getInstance();
+            $config = $purifier->config;
+            if ($config->get('AutoFormat', 'AutoParagraph')) {
+                ?><p>
+                    <strong>Auto-paragraphing</strong> is enabled. Double
+                    newlines will be converted to paragraphs; for single
+                    newlines, use the <code>pre</code> tag.
+                </p><?php
+            }
+            $html_definition = $config->getDefinition('HTML');
+            $allowed = array();
+            foreach ($html_definition->info as $name => $x) $allowed[] = "<code>$name</code>";
+            sort($allowed);
+            $allowed_text = implode(', ', $allowed);
+            ?><p><strong>Allowed tags:</strong> <?php
+            echo $allowed_text;
+            ?>.</p><?php
+        ?>
+    </p>
+    <p>
+        For inputting literal code such as HTML and PHP for display, use
+        CDATA tags to auto-escape your angled brackets, and <code>pre</code>
+        to preserve newlines:
+    </p>
+    <pre>&lt;pre&gt;&lt;![CDATA[
+<em>Place code here</em>
+]]&gt;&lt;/pre&gt;</pre>
+    <p>
+        Power users, you can hide this notice with:
+        <pre>.htmlpurifier-help {display:none;}</pre>
+    </p>
    </td></tr><?php
 }

--- a/plugins/phorum/settings/migrate-sigs.php
+++ b/plugins/phorum/settings/migrate-sigs.php
@ -20,8 +20,10 @@ function phorum_htmlpurifier_migrate_sigs_check() {
 function phorum_htmlpurifier_migrate_sigs($offset) {
    global $PHORUM;
    
-    if(!$offset) return; // bail out quick of $offset == 0
+    if(!$offset) return; // bail out quick if $offset == 0
    
+    // theoretically, we could get rid of this multi-request
+    // doo-hickery if safe mode is off
    @set_time_limit(0); // attempt to let this run
    $increment = $PHORUM['mod_htmlpurifier']['migrate-sigs-increment'];
    
@ -52,21 +54,19 @@ function phorum_htmlpurifier_migrate_sigs($offset) {
    
    // query for highest ID in database
    $type = $PHORUM['DBCONFIG']['type'];
+    $sql = "select MAX(user_id) from {$PHORUM['user_table']}";
    if ($type == 'mysql') {
        $conn = phorum_db_mysql_connect();
-        $sql = "select MAX(user_id) from {$PHORUM['user_table']}";
        $res = mysql_query($sql, $conn);
        $row = mysql_fetch_row($res);
-        $top_id = (int) $row[0];
    } elseif ($type == 'mysqli') {
        $conn = phorum_db_mysqli_connect();
-        $sql = "select MAX(user_id) from {$PHORUM['user_table']}";
        $res = mysqli_query($conn, $sql);
        $row = mysqli_fetch_row($res);
-        $top_id = (int) $row[0];
    } else {
        exit('Unrecognized database!');
    }
+    $top_id = (int) $row[0];
    
    $offset += $increment;
    if ($offset > $top_id) { // test for end condition
--- a/tests/HTMLPurifier/ChildDef/OptionalTest.php
+++ b/tests/HTMLPurifier/ChildDef/OptionalTest.php
@ -19,5 +19,9 @@ class HTMLPurifier_ChildDef_OptionalTest extends HTMLPurifier_ChildDefHarness
        $this->assertResult('Not allowed text', '');
    }
    
+    function testEmpty() {
+        $this->assertResult('');
+    }
+    
 }

--- a/tests/HTMLPurifier/ChildDef/StrictBlockquoteTest.php
+++ b/tests/HTMLPurifier/ChildDef/StrictBlockquoteTest.php
@ -74,10 +74,11 @@ extends HTMLPurifier_ChildDefHarness
    }
    
    function testError() {
-        $this->expectError('Cannot use non-block element as block wrapper');
+        // $this->expectError('Cannot use non-block element as block wrapper');
        $this->obj = new HTMLPurifier_ChildDef_StrictBlockquote('div | p');
        $this->config->set('HTML', 'BlockWrapper', 'dav');
        $this->assertResult('Needs wrap', '<p>Needs wrap</p>');
+        $this->swallowErrors();
    }
    
 }
--- a/tests/HTMLPurifier/ErrorCollectorEMock.php
+++ b/tests/HTMLPurifier/ErrorCollectorEMock.php
@ -27,11 +27,20 @@ class HTMLPurifier_ErrorCollectorEMock extends HTMLPurifier_ErrorCollectorMock
    
    function send($severity, $msg) {
        // test for context
-        $test = &$this->_getCurrentTestCase();
+        $context =& SimpleTest::getContext();
+        $test =& $context->getTest();
+        
+        // compat
+        if (empty($this->_mock)) {
+            $mock =& $this;
+        } else {
+            $mock =& $this->_mock;
+        }
+        
        foreach ($this->_expected_context as $key => $value) {
            $test->assertEqual($value, $this->_context->get($key));
        }
-        $step = $this->getCallCount('send');
+        $step = $mock->getCallCount('send');
        if (isset($this->_expected_context_at[$step])) {
            foreach ($this->_expected_context_at[$step] as $key => $value) {
                $test->assertEqual($value, $this->_context->get($key));
@ -39,7 +48,7 @@ class HTMLPurifier_ErrorCollectorEMock extends HTMLPurifier_ErrorCollectorMock
        }
        // boilerplate mock code, does not have return value or references
        $args = func_get_args();
-        $this->_invoke('send', $args);
+        $mock->_invoke('send', $args);
    }
    
 }
--- a/tests/HTMLPurifier/ErrorsHarness.php
+++ b/tests/HTMLPurifier/ErrorsHarness.php
@ -3,11 +3,15 @@
 require_once 'HTMLPurifier/ErrorCollectorEMock.php';
 require_once 'HTMLPurifier/Lexer/DirectLex.php';

+/**
+ * @todo Make the callCount variable actually work, so we can precisely
+ *       specify what errors we want: no more, no less
+ */
 class HTMLPurifier_ErrorsHarness extends HTMLPurifier_Harness
 {
    
    var $config, $context;
-    var $collector, $generator;
+    var $collector, $generator, $callCount;
    
    function setup() {
        $this->config = HTMLPurifier_Config::create(array('Core.CollectErrors' => true));
@ -16,6 +20,11 @@ class HTMLPurifier_ErrorsHarness extends HTMLPurifier_Harness
        $this->collector = new HTMLPurifier_ErrorCollectorEMock();
        $this->collector->prepare($this->context);
        $this->context->register('ErrorCollector', $this->collector);
+        $this->callCount = 0;
+    }
+    
+    function expectNoErrorCollection() {
+        $this->collector->expectNever('send');
    }
    
    function expectErrorCollection() {
--- a/tests/HTMLPurifier/IDAccumulatorTest.php
+++ b/tests/HTMLPurifier/IDAccumulatorTest.php
@ -30,5 +30,11 @@ class HTMLPurifier_IDAccumulatorTest extends HTMLPurifier_Harness
        
    }
    
+    function testBuild() {
+        $this->config->set('Attr', 'IDBlacklist', array('foo'));
+        $accumulator = HTMLPurifier_IDAccumulator::build($this->config, $this->context);
+        $this->assertTrue( isset($accumulator->ids['foo']) );
+    }
+    
 }

--- a/tests/HTMLPurifier/Injector/AutoParagraphTest.php
+++ b/tests/HTMLPurifier/Injector/AutoParagraphTest.php
@ -194,10 +194,7 @@ Bar</p></div>',
    }
    
    function testNoParagraphSingleInlineNodeInBlockNode() {
-        $this->assertResult(
-'<div><b>Foo</b></div>',
-            '<div><b>Foo</b></div>'
-        );
+        $this->assertResult( '<div><b>Foo</b></div>' );
    }
    
    function testParagraphInBlockquote() {
@ -277,9 +274,7 @@ Par1
    function testBlockNodeTextDelimeterWithoutDoublespaceInBlockNode() {
        $this->assertResult(
 '<div>Par1
-<div>Par2</div></div>',
-'<div><p>Par1
-</p><div>Par2</div></div>'
+<div>Par2</div></div>'
        );
    }
    
@ -351,6 +346,30 @@ Par2'
        );
    }
    
+    function testInlineAndBlockTagInDivNoParagraph() {
+        $this->assertResult(
+            '<div><code>bar</code> mmm <pre>asdf</pre></div>'
+        );
+    }
+    
+    function testInlineAndBlockTagInDivNeedingParagraph() {
+        $this->assertResult(
+'<div><code>bar</code> mmm
+
+<pre>asdf</pre></div>',
+'<div><p><code>bar</code> mmm</p><pre>asdf</pre></div>'
+        );
+    }
+    
+    function testTextInlineNodeTextThenDoubleNewlineNeedsParagraph() {
+        $this->assertResult(
+'<div>asdf <code>bar</code> mmm
+
+<pre>asdf</pre></div>',
+'<div><p>asdf <code>bar</code> mmm</p><pre>asdf</pre></div>'
+        );
+    }
+    
    function testErrorNeeded() {
        $this->config->set('HTML', 'Allowed', 'b');
        $this->expectError('Cannot enable AutoParagraph injector because p is not allowed');
--- a/tests/HTMLPurifier/Strategy/FixNestingTest.php
+++ b/tests/HTMLPurifier/Strategy/FixNestingTest.php
@ -109,8 +109,9 @@ class HTMLPurifier_Strategy_FixNestingTest extends HTMLPurifier_StrategyHarness
   function testInvalidParentError() {
        // test fallback to div
        $this->config->set('HTML', 'Parent', 'obviously-impossible');
-        $this->expectError('Cannot use unrecognized element as parent');
+        // $this->expectError('Cannot use unrecognized element as parent');
        $this->assertResult('<div>Accept</div>');
+        $this->swallowErrors();
    }
    
    function testCascadingRemovalOfNodesMissingRequiredChildren() {
@ -129,5 +130,10 @@ class HTMLPurifier_Strategy_FixNestingTest extends HTMLPurifier_StrategyHarness
        $this->assertResult('<table></table><table></table>', '');
    }
    
+    function testStrictBlockquoteInHTML401() {
+        $this->config->set('HTML', 'Doctype', 'HTML 4.01 Strict');
+        $this->assertResult('<blockquote>text</blockquote>', '<blockquote><p>text</p></blockquote>');
+    }
+    
 }

--- a/tests/HTMLPurifier/Strategy/FixNesting_ErrorsTest.php
+++ b/tests/HTMLPurifier/Strategy/FixNesting_ErrorsTest.php
@ -28,6 +28,11 @@ class HTMLPurifier_Strategy_FixNesting_ErrorsTest extends HTMLPurifier_Strategy_
        $this->invoke("<span>Valid<div>Invalid</div></span>");
    }
    
+    function testNoNodeReorganizedForEmptyNode() {
+        $this->expectNoErrorCollection();
+        $this->invoke("<span></span>");
+    }
+    
    function testNodeContentsRemoved() {
        $this->expectErrorCollection(E_ERROR, 'Strategy_FixNesting: Node contents removed');
        $this->expectContext('CurrentToken', new HTMLPurifier_Token_Start('span', array(), 1));
--- a/tests/HTMLPurifier/Strategy/MakeWellFormed_InjectorTest.php
+++ b/tests/HTMLPurifier/Strategy/MakeWellFormed_InjectorTest.php
@ -11,6 +11,19 @@ class HTMLPurifier_Strategy_MakeWellFormed_InjectorTest extends HTMLPurifier_Str
        $this->obj = new HTMLPurifier_Strategy_MakeWellFormed();
        $this->config->set('AutoFormat', 'AutoParagraph', true);
        $this->config->set('AutoFormat', 'Linkify', true);
+        generate_mock_once('HTMLPurifier_Injector');
+    }
+    
+    function testEndNotification() {
+        $mock = new HTMLPurifier_InjectorMock();
+        $mock->skip = false;
+        $mock->expectAt(0, 'notifyEnd', array(new HTMLPurifier_Token_End('b')));
+        $mock->expectAt(1, 'notifyEnd', array(new HTMLPurifier_Token_End('i')));
+        $mock->expectCallCount('notifyEnd', 2);
+        $this->config->set('AutoFormat', 'AutoParagraph', false);
+        $this->config->set('AutoFormat', 'Linkify',       false);
+        $this->config->set('AutoFormat', 'Custom', array($mock));
+        $this->assertResult('<i><b>asdf</b>', '<i><b>asdf</b></i>');
    }
    
    function testOnlyAutoParagraph() {
@ -62,4 +75,11 @@ class HTMLPurifier_Strategy_MakeWellFormed_InjectorTest extends HTMLPurifier_Str
        );
    }
    
+    function testParagraphAfterLinkifiedURL() {
+        $this->assertResult(
+            "http://google.com\n\n<b>b</b>",
+            "<p><a href=\"http://google.com\">http://google.com</a></p><p><b>b</b></p>"
+        );
+    }
+    
 }
--- a/tests/HTMLPurifier/Strategy/RemoveForeignElementsTest.php
+++ b/tests/HTMLPurifier/Strategy/RemoveForeignElementsTest.php
@ -82,5 +82,14 @@ alert(&lt;b&gt;bold&lt;/b&gt;);
        );
    }
    
+    function testRequiredAttributesTestNotPerformedOnEndTag() {
+        $this->config->set('HTML', 'DefinitionID',
+            'HTMLPurifier_Strategy_RemoveForeignElementsTest'.
+            '->testRequiredAttributesTestNotPerformedOnEndTag');
+        $def =& $this->config->getHTMLDefinition(true);
+        $def->addElement('f', 'Block', 'Optional: #PCDATA', false, array('req*' => 'Text'));
+        $this->assertResult('<f req="text">Foo</f> Bar');
+    }
+    
 }

--- a/tests/HTMLPurifier/URIFilter/MakeAbsoluteTest.php
+++ b/tests/HTMLPurifier/URIFilter/MakeAbsoluteTest.php
@ -111,6 +111,12 @@ class HTMLPurifier_URIFilter_MakeAbsoluteTest extends HTMLPurifier_URIFilterHarn
        $this->assertFiltering('.', '../');
    }
    
+    function testRemoveJavaScriptWithEmbeddedLink() {
+        // credits: NykO18
+        $this->setBase('http://www.example.com/');
+        $this->assertFiltering('javascript: window.location = \'http://www.example.com\';', false);
+    }
+    
    // error case
    
    function testErrorNoBase() {
--- a/tests/HTMLPurifierTest.php
+++ b/tests/HTMLPurifierTest.php
@ -94,6 +94,7 @@ class HTMLPurifierTest extends HTMLPurifier_Harness
        
        $this->purifier = new HTMLPurifier(array('HTML.EnableAttrID' => true));
        $this->assertPurification('<span id="moon">foobar</span>');
+        $this->assertPurification('<img id="folly" src="folly.png" alt="Omigosh!" />');
        
    }
    
--- a/tests/index.php
+++ b/tests/index.php
@ -3,7 +3,9 @@
 // call one file using /?f=FileTest.php , see $test_files array for
 // valid values

-error_reporting(E_ALL | E_STRICT);
+if (version_compare(PHP_VERSION, '5.1', '>=')) error_reporting(E_ALL | E_STRICT);
+else error_reporting(E_ALL);
+
 define('HTMLPurifierTest', 1);
 define('HTMLPURIFIER_SCHEMA_STRICT', true); // validate schemas

@ -17,6 +19,7 @@ $GLOBALS['HTMLPurifierTest']['PH5P'] = version_compare(PHP_VERSION, "5", ">=") &
 $simpletest_location = 'simpletest/'; // reasonable guess

 // load SimpleTest
+if (file_exists('../conf/test-settings.php')) include '../conf/test-settings.php';
 if (file_exists('../test-settings.php')) include '../test-settings.php';
 require_once $simpletest_location . 'unit_tester.php';
 require_once $simpletest_location . 'reporter.php';
@ -79,7 +82,7 @@ if ($test_file = $GLOBALS['HTMLPurifierTest']['File']) {
    
 } else {
    
-    $test = new GroupTest('All Tests');
+    $test = new GroupTest('All tests on PHP ' . PHP_VERSION);
    foreach ($test_files as $test_file) {
        require_once $test_file;
        $test->addTestClass(path2class($test_file));
@ -91,5 +94,3 @@ if (SimpleReporter::inCli()) $reporter = new TextReporter();
 else $reporter = new HTMLPurifier_SimpleTest_Reporter('UTF-8');

 $test->run($reporter);
-
-
--- a/tests/multitest.php
+++ b/tests/multitest.php
@ -0,0 +1,33 @@
+<?php
+
+$versions_to_test = array(
+    'FLUSH',
+    '5.0.4',
+    '5.0.5',
+    '5.1.4',
+    '5.1.6',
+    '5.2.0',
+    '5.2.1',
+    '5.2.2',
+    '5.2.3',
+    '5.2.4',
+    '5.2.5RC2-dev',
+    '5.3.0-dev',
+    // '6.0.0-dev',
+);
+
+echo str_repeat('-', 70) . "\n";
+echo "HTML Purifier\n";
+echo "Multiple PHP Versions Test\n\n";
+
+passthru("php ../maintenance/merge-library.php");
+
+foreach ($versions_to_test as $version) {
+    if ($version === 'FLUSH') {
+        shell_exec('php ../maintenance/flush-definition-cache.php');
+        continue;
+    }
+    passthru("phpv $version index.php");
+    passthru("phpv $version index.php standalone");
+    echo "\n\n";
+}
 @ -1 +1 @@
 .1.2
 .1.3