mirror of https://github.com/ezyang/htmlpurifier.git synced 2025-03-12 01:28:44 +00:00

Release 2.1.2, merged in 1368 to HEAD.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/strict@1404 48356398-32a2-884e-a903-53898d9a118a
Edward Z. Yang 2007-09-03 15:40:43 +00:00
parent 80c60bb9b5
commit b3f0e6c86c
72 changed files with 6233 additions and 1405 deletions

@ -4,7 +4,7 @@
# Project related configuration options # Project related configuration options
#--------------------------------------------------------------------------- #---------------------------------------------------------------------------
PROJECT_NAME = HTML Purifier PROJECT_NAME = HTML Purifier
PROJECT_NUMBER = 2.1.1 PROJECT_NUMBER = 2.1.2
OUTPUT_DIRECTORY = "C:/Documents and Settings/Edward/My Documents/My Webs/htmlpurifier/docs/doxygen" OUTPUT_DIRECTORY = "C:/Documents and Settings/Edward/My Documents/My Webs/htmlpurifier/docs/doxygen"
CREATE_SUBDIRS = NO CREATE_SUBDIRS = NO
OUTPUT_LANGUAGE = English OUTPUT_LANGUAGE = English

28
NEWS

@ -9,6 +9,34 @@ NEWS ( CHANGELOG and HISTORY ) HTMLPurifier
. Internal change . Internal change
========================== ==========================
2.1.2, released 2007-09-03
! Implemented Object module for trusted users
! Implemented experimental HTML5 parsing mode using PH5P. To use, add
this to your code:
require_once 'HTMLPurifier/Lexer/PH5P.php';
$config->set('Core', 'LexerImpl', 'PH5P');
Note that this Lexer introduces some classes not in the HTMLPurifier
namespace. Also, this is PHP5 only.
! CSS property border-spacing implemented
- Fix non-visible parsing error in DirectLex with empty tags that have
slashes inside attribute values.
- Fix typo in CSS definition: border-collapse:seperate; was incorrectly
accepted as valid CSS. Usually non-visible, because this styling is the
default for tables in most browsers. Thanks Brett Zamir for pointing
this out.
- Fix validation errors in configuration form
- Hammer out a bunch of edge-case bugs in the standalone distribution
- Inclusion reflection removed from URISchemeRegistry; you must manually
include any new schema files you wish to use
- Numerous typo fixes in documentation thanks to Brett Zamir
. Unit test refactoring for one logical test per test function
. Config and context parameters in ComplexHarness deprecated: instead, edit
the $config and $context member variables
. HTML wrapper in DOMLex now takes DTD identifiers into account; doesn't
really make a difference, but is good for completeness sake
. merge-library.php script refactored for greater code reusability and
PHP4 compatibility
2.1.1, released 2007-08-04 2.1.1, released 2007-08-04
- Fix show-stopper bug in %URI.MakeAbsolute functionality - Fix show-stopper bug in %URI.MakeAbsolute functionality
- Fix PHP4 syntax error in standalone version - Fix PHP4 syntax error in standalone version

15
TODO

@ -28,23 +28,22 @@ afraid to cast your vote for the next feature to be implemented!
- Remove empty inline tags<i></i> - Remove empty inline tags<i></i>
- Append something to duplicate IDs so they're still usable (impl. note: the - Append something to duplicate IDs so they're still usable (impl. note: the
dupe detector would also need to detect the suffix as well) dupe detector would also need to detect the suffix as well)
- Externalize inline CSS to promote clean HTML
2.4 release [It's All About Trust] (floating) 2.4 release [It's All About Trust] (floating)
# Implement untrusted, dangerous elements/attributes # Implement untrusted, dangerous elements/attributes
# Implement IDREF support (harder than it seems, since you cannot have # Implement IDREF support (harder than it seems, since you cannot have
IDREFs to non-existent IDs) IDREFs to non-existent IDs)
# Frameset XHTML 1.0 and HTML 4.01 doctypes
3.0 release [Beyond HTML] 3.0 release [Beyond HTML]
# Legit token based CSS parsing (will require revamping almost every # Legit token based CSS parsing (will require revamping almost every
AttrDef class) AttrDef class). Probably will use CSSTidy class
# More control over allowed CSS properties (maybe modularize it in the # More control over allowed CSS properties (maybe modularize it in the
same fashion!) same fashion!)
# Formatters for plaintext # Formatters for plaintext
- Smileys - Smileys
- Standardize token armor for all areas of processing - Standardize token armor for all areas of processing
- Fixes for Firefox's inability to handle COL alignment props (Bug 915)
- Automatically add non-breaking spaces to empty table cells when
empty-cells:show is applied to have compatibility with Internet Explorer
- Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand. - Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
Also, enable disabling of directionality Also, enable disabling of directionality
@ -63,25 +62,27 @@ Ongoing
- Complete basic smoketests - Complete basic smoketests
Unknown release (on a scratch-an-itch basis) Unknown release (on a scratch-an-itch basis)
? Semi-lossy dumb alternate character encoding transfor # CHMOD install script for PEAR installs
? Have 'lang' attribute be checked against official lists, achieved by ? Have 'lang' attribute be checked against official lists, achieved by
encoding all characters that have string entity equivalents encoding all characters that have string entity equivalents
- Abstract ChildDef_BlockQuote to work with all elements that only - Abstract ChildDef_BlockQuote to work with all elements that only
allow blocks in them, required or optional allow blocks in them, required or optional
- Reorganize Unit Tests - Reorganize Unit Tests
- Refactor loop tests: Lexer
- Reorganize configuration directives (Create more namespaces! Get messy!) - Reorganize configuration directives (Create more namespaces! Get messy!)
- Advanced URI filtering schemes (see docs/proposal-new-directives.txt) - Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
- Implement lenient <ruby> child validation - Implement lenient <ruby> child validation
- Explain how to use HTML Purifier in non-PHP languages / create - Explain how to use HTML Purifier in non-PHP languages / create
a simple command line stub (or complicated?) a simple command line stub (or complicated?)
- Fixes for Firefox's inability to handle COL alignment props (Bug 915)
- Automatically add non-breaking spaces to empty table cells when
empty-cells:show is applied to have compatibility with Internet Explorer
Requested Requested
Wontfix Wontfix
- Non-lossy smart alternate character encoding transformations (unless - Non-lossy smart alternate character encoding transformations (unless
patch provided) patch provided)
- Pretty-printing HTML, users can use Tidy on the output on entire page - Pretty-printing HTML: users can use Tidy on the output on entire page
- Native content compression, whitespace stripping (don't rely on Tidy, make - Native content compression, whitespace stripping (don't rely on Tidy, make
sure we don't remove from <pre> or related tags): use gzip if this is sure we don't remove from <pre> or related tags): use gzip if this is
really important really important

@ -1 +1 @@
2.1.1 2.1.2

@ -1,10 +1,8 @@
In version 2.1, HTML Purifier's URI validation and filtering handling Version 2.1.2 is a mix of experimental features and stability updates.
system has been revamped with a new, extensible URIFilter system. Also Among new features: an Object module for trusted users, support for the
notable features include preservation of emoticons in PHP5 with CSS property 'border-spacing', and HTML 5 style parsing using PH5P.
%Core.AggressivelyFixLt, standalone and lite download versions, Bug fixes have resolved a few obscure issues including border-collapse:seperate,
transforming relative URIs to absolute URIs, Ruby in XHTML 1.1, a Phorum a DirectLex parsing error, broken HTML in printDefinition.php, and problems
mod, and UTF-8 font names. Notable bug-fixes include refinement of with the experimental standalone distribution. Also, there were large
the auto-paragraphing algorithm (no longer experimental), better XHTML amounts of behind-the-scenes refactoring and the removal of URIScheme
1.1 support and the removal of the contents of <style> elements. Version inclusion reflection.
2.1.1 amends a few bugs in some of newly introduced features, namely
running the standalone download version in PHP4 and %URI.MakeAbsolute.

@ -39,7 +39,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
<table cellspacing="0"><tbody> <table cellspacing="0"><tbody>
<tr><td class="impl-yes">Implemented</td></tr> <tr><td class="impl-yes">Implemented</td></tr>
<tr><td class="impl-partial">Partially implemented</td></tr> <tr><td class="impl-partial">Partially implemented</td></tr>
<tr><td class="impl-no">Will not implement</td></tr> <tr><td class="impl-no">Not priority to implement</td></tr>
<tr><td class="danger">Dangerous attribute/property</td></tr> <tr><td class="danger">Dangerous attribute/property</td></tr>
<tr><td class="css1">Present in CSS1</td></tr> <tr><td class="css1">Present in CSS1</td></tr>
<tr><td class="feature">Feature, requires extra work</td></tr> <tr><td class="feature">Feature, requires extra work</td></tr>
@ -118,6 +118,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
<tbody> <tbody>
<tr><th colspan="2">Table</th></tr> <tr><th colspan="2">Table</th></tr>
<tr class="impl-yes"><td>border-collapse</td><td>ENUM(collapse, seperate)</td></tr> <tr class="impl-yes"><td>border-collapse</td><td>ENUM(collapse, seperate)</td></tr>
<tr class="impl-yes"><td>border-space</td><td>MULTIPLE</td></tr>
<tr class="impl-yes"><td>caption-side</td><td>ENUM(top, bottom)</td></tr> <tr class="impl-yes"><td>caption-side</td><td>ENUM(top, bottom)</td></tr>
<tr class="feature"><td>empty-cells</td><td>ENUM(show, hide), No IE support makes this useless, <tr class="feature"><td>empty-cells</td><td>ENUM(show, hide), No IE support makes this useless,
possible fix with &amp;nbsp;? Unknown release milestone.</td></tr> possible fix with &amp;nbsp;? Unknown release milestone.</td></tr>

@ -32,7 +32,7 @@
Before we even write any code, it is paramount to consider whether or Before we even write any code, it is paramount to consider whether or
not the code we're writing is necessary or not. HTML Purifier, by default, not the code we're writing is necessary or not. HTML Purifier, by default,
contains a large set of elements and attributes: large enough so that contains a large set of elements and attributes: large enough so that
<em>any</em> element or attribute in XHTML 1.0 (and its HTML variant) <em>any</em> element or attribute in XHTML 1.0 or 1.1 (and its HTML variants)
that can be safely used by the general public is implemented. that can be safely used by the general public is implemented.
</p> </p>
@ -76,11 +76,12 @@
<h3>XHTML 1.1</h3> <h3>XHTML 1.1</h3>
<p> <p>
We have not implemented the As of HTMLPurifier 2.1.0, we have implemented the
<a href="http://www.w3.org/TR/2001/REC-ruby-20010531/">Ruby module</a>, <a href="http://www.w3.org/TR/2001/REC-ruby-20010531/">Ruby module</a>,
which defines a set of tags which defines a set of tags
for publishing short annotations for text, used mostly in Japanese for publishing short annotations for text, used mostly in Japanese
and Chinese school texts. and Chinese school texts, but applicable for positioning any text (not
limited to translations) above or below other corresponding text.
</p> </p>
<h3>XHTML 2.0</h3> <h3>XHTML 2.0</h3>
@ -492,10 +493,11 @@ $def =& $config->getHTMLDefinition(true);
<p> <p>
The <code>(%flow;)*</code> indicates the allowed children of the The <code>(%flow;)*</code> indicates the allowed children of the
<code>li</code> tag: <code>li</code> allows any number of flow <code>li</code> tag: <code>li</code> allows any number of flow
elements as its children. In HTML Purifier, we'd write it like elements as its children. (The <code>- O</code> allows the closing tag to be
<code>Flow</code> (here's where the content sets we were omitted, though in XML this is not allowed.) In HTML Purifier,
discussing earlier come into play). There are three shorthand content models you we'd write it like <code>Flow</code> (here's where the content sets
can specify: we were discussing earlier come into play). There are three shorthand
content models you can specify:
</p> </p>
<table class="table"> <table class="table">
@ -668,12 +670,22 @@ $def =& $config->getHTMLDefinition(true);
Common is a combination of the above-mentioned collections. Common is a combination of the above-mentioned collections.
</p> </p>
<p class="aside">
Readers familiar with the modularization may have noticed that the Core
attribute collection differs from that specified by the <a
href="http://www.w3.org/TR/xhtml-modularization/abstract_modules.html#s_commonatts">abstract
modules of the XHTML Modularization 1.1</a>. We believe this section
to be in error, as <code>br</code> permits the use of the <code>style</code>
attribute even though it uses the <code>Core</code> collection, and
the DTD and XML Schemas supplied by W3C support our interpretation.
</p>
<h3>Attributes</h3> <h3>Attributes</h3>
<p> <p>
If you didn't read the <a href="#addAttribute">previous section on If you didn't read the <a href="#addAttribute">earlier section on
adding attributes</a>, read it now. The last parameter is simply adding attributes</a>, read it now. The last parameter is simply
array of attribute names to attribute implementations, in the exact an array of attribute names to attribute implementations, in the exact
same format as <code>addAttribute()</code>. same format as <code>addAttribute()</code>.
</p> </p>

@ -58,7 +58,7 @@ appear elsewhere on the document. The method is simple:</p>
<pre>$config->set('HTML', 'EnableAttrID', true); <pre>$config->set('HTML', 'EnableAttrID', true);
$config->set('Attr', 'IDBlacklist', array( $config->set('Attr', 'IDBlacklist', array(
'list', 'of', 'attributes', 'that', 'are', 'forbidden' 'list', 'of', 'attribute', 'values', 'that', 'are', 'forbidden'
));</pre> ));</pre>
<p>That being said, there are some notable drawbacks. First of all, you have to <p>That being said, there are some notable drawbacks. First of all, you have to
@ -71,9 +71,9 @@ to possible standards-compliance issues.</p>
<p>Furthermore, this position becomes untenable when a single web page must hold <p>Furthermore, this position becomes untenable when a single web page must hold
multiple portions of user-submitted content. Since there's obviously no way multiple portions of user-submitted content. Since there's obviously no way
to find out before-hand what IDs users will use, the blacklist is helpless. to find out before-hand what IDs users will use, the blacklist is helpless.
And even since HTML Purifier validates each segment seperately, perhaps doing And since HTML Purifier validates each segment separately, perhaps doing
so at different times, it would be extremely difficult to dynamically update so at different times, it would be extremely difficult to dynamically update
the blacklist inbetween runs.</p> the blacklist in between runs.</p>
<p>Finally, simply destroying the ID is extremely un-userfriendly behavior: after <p>Finally, simply destroying the ID is extremely un-userfriendly behavior: after
all, they might have simply specified a duplicate ID by accident.</p> all, they might have simply specified a duplicate ID by accident.</p>

@ -22,7 +22,7 @@ out:</p>
<p class="emphasis">This ain't HTML Tidy!</p> <p class="emphasis">This ain't HTML Tidy!</p>
<p>Rather, Tidy stands for a cool set of Tidy-inspired in HTML Purifier <p>Rather, Tidy stands for a cool set of Tidy-inspired features in HTML Purifier
that allows users to submit deprecated elements and attributes and get that allows users to submit deprecated elements and attributes and get
valid strict markup back. For example:</p> valid strict markup back. For example:</p>
@ -33,8 +33,8 @@ valid strict markup back. For example:</p>
<pre>&lt;div style=&quot;text-align:center;&quot;&gt;Centered&lt;/div&gt;</pre> <pre>&lt;div style=&quot;text-align:center;&quot;&gt;Centered&lt;/div&gt;</pre>
<p>...when this particular fix is run on the HTML. This tutorial will give <p>...when this particular fix is run on the HTML. This tutorial will give
you down the lowdown of what exactly HTML Purifier will do when Tidy you the lowdown of what exactly HTML Purifier will do when Tidy
is on, and how to fine tune this behavior. Once again, <strong>you do is on, and how to fine-tune this behavior. Once again, <strong>you do
not need Tidy installed on your PHP to use these features!</strong></p> not need Tidy installed on your PHP to use these features!</strong></p>
<h2>What does it do?</h2> <h2>What does it do?</h2>
@ -221,7 +221,7 @@ general syntax:</p>
<p>The lowdown is, quite frankly, HTML Purifier's default settings are <p>The lowdown is, quite frankly, HTML Purifier's default settings are
probably good enough. The next step is to bump the level up to heavy, probably good enough. The next step is to bump the level up to heavy,
and if that still doesn't satisfy your appetite, do some fine tuning. and if that still doesn't satisfy your appetite, do some fine-tuning.
Other than that, don't worry about it: this all works silently and Other than that, don't worry about it: this all works silently and
effectively in the background.</p> effectively in the background.</p>

@ -96,7 +96,7 @@ which can be a rewarding (but difficult) task.</p>
<h2 id="findcharset">Finding the real encoding</h2> <h2 id="findcharset">Finding the real encoding</h2>
<p>In the beginning, there was ASCII, and things were simple. But they <p>In the beginning, there was ASCII, and things were simple. But they
weren't good, for no one could write in Cryllic or Thai. So there weren't good, for no one could write in Cyrillic or Thai. So there
exploded a proliferation of character encodings to remedy the problem exploded a proliferation of character encodings to remedy the problem
by extending the characters ASCII could express. This ridiculously by extending the characters ASCII could express. This ridiculously
simplified version of the history of character encodings shows us that simplified version of the history of character encodings shows us that
@ -138,7 +138,7 @@ browser:</p>
<dd>View &gt; Encoding: bulleted item is unofficial name</dd> <dd>View &gt; Encoding: bulleted item is unofficial name</dd>
</dl> </dl>
<p>Internet Explorer won't give you the mime (i.e. useful/real) name of the <p>Internet Explorer won't give you the MIME (i.e. useful/real) name of the
character encoding, so you'll have to look it up using their description. character encoding, so you'll have to look it up using their description.
Some common ones:</p> Some common ones:</p>
@ -216,6 +216,12 @@ if your <code>META</code> tag claims that either:</p>
<h2 id="fixcharset">Fixing the encoding</h2> <h2 id="fixcharset">Fixing the encoding</h2>
<p class="aside">The advice given here is for pages being served as
vanilla <code>text/html</code>. Different practices must be used
for <code>application/xml</code> or <code>application/xhtml+xml</code>, see
<a href="http://www.w3.org/TR/2002/NOTE-xhtml-media-types-20020430/">W3C's
document on XHTML media types</a> for more information.</p>
<p>If your <code>META</code> encoding and your real encoding match, <p>If your <code>META</code> encoding and your real encoding match,
savvy! You can skip this section. If they don't...</p> savvy! You can skip this section. If they don't...</p>
@ -302,7 +308,8 @@ languages</a>. The appropriate code is:</p>
<p>...replacing UTF-8 with whatever your embedded encoding is. <p>...replacing UTF-8 with whatever your embedded encoding is.
This code must come before any output, so be careful about This code must come before any output, so be careful about
stray whitespace in your application.</p> stray whitespace in your application (i.e., any whitespace before
output excluding whitespace within &lt;?php ?&gt; tags).</p>
<h4 id="fixcharset-server-phpini">PHP ini directive</h4> <h4 id="fixcharset-server-phpini">PHP ini directive</h4>
@ -313,8 +320,8 @@ header call: <code><a href="http://php.net/ini.core#ini.default-charset">default
<p>...will also do the trick. If PHP is running as an Apache module (and <p>...will also do the trick. If PHP is running as an Apache module (and
not as FastCGI, consult not as FastCGI, consult
<a href="http://php.net/phpinfo">phpinfo</a>() for details), you can even use htaccess do apply this property <a href="http://php.net/phpinfo">phpinfo</a>() for details), you can even use htaccess to apply this property
globally:</p> across many PHP files:</p>
<pre><a href="http://php.net/configuration.changes#configuration.changes.apache">php_value</a> default_charset &quot;UTF-8&quot;</pre> <pre><a href="http://php.net/configuration.changes#configuration.changes.apache">php_value</a> default_charset &quot;UTF-8&quot;</pre>
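
Editorial sketch, not part of the commit: the two PHP-side options this hunk discusses (an explicit header() call and the default_charset ini directive) might look roughly like this at runtime. The name content_type_header is a hypothetical helper introduced only for illustration, and UTF-8 is assumed to be the real encoding of the page.

```php
<?php
// Hypothetical helper (not part of PHP or HTML Purifier): builds the header line.
function content_type_header($charset = 'UTF-8')
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Option 1: send the header explicitly. This only works before any output,
// so check headers_sent() instead of assuming no stray whitespace leaked out.
if (!headers_sent()) {
    header(content_type_header('UTF-8'));
}

// Option 2: set the ini directive at runtime, for the current script only.
ini_set('default_charset', 'UTF-8');
```

The headers_sent() guard reflects the warning above about stray whitespace: once anything has been output, the header() call can no longer take effect.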
@ -360,10 +367,11 @@ to send anything at all:</p>
<pre><a href="http://httpd.apache.org/docs/1.3/mod/core.html#adddefaultcharset">AddDefaultCharset</a> Off</pre> <pre><a href="http://httpd.apache.org/docs/1.3/mod/core.html#adddefaultcharset">AddDefaultCharset</a> Off</pre>
<p>...making your <code>META</code> tags the sole source of <p>...making your internal charset declaration (usually the <code>META</code> tags)
character encoding information. In these cases, it is the sole source of character encoding
<em>especially</em> important to make sure you have valid <code>META</code> information. In these cases, it is <em>especially</em> important to make
tags on your pages and all the text before them is ASCII.</p> sure you have valid <code>META</code> tags on your pages and all the
text before them is ASCII.</p>
<blockquote class="aside"><p>These directives can also be <blockquote class="aside"><p>These directives can also be
placed in httpd.conf file for Apache, but placed in httpd.conf file for Apache, but
@ -428,28 +436,30 @@ IIS to change character encodings, I'd be grateful.</p>
<p><code>META</code> tags are the most common source of embedded <p><code>META</code> tags are the most common source of embedded
encodings, but they can also come from somewhere else: XML encodings, but they can also come from somewhere else: XML
processing instructions. They look like:</p> Declarations. They look like:</p>
<pre>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;</pre> <pre>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;</pre>
<p>...and are most often found in XML documents (including XHTML).</p> <p>...and are most often found in XML documents (including XHTML).</p>
<p>For XHTML, this processing instruction theoretically <p>For XHTML, this XML Declaration theoretically
overrides the <code>META</code> tag. In reality, this happens only when the overrides the <code>META</code> tag. In reality, this happens only when the
XHTML is actually served as legit XML and not HTML, which is almost always XHTML is actually served as legit XML and not HTML, which is almost always
never due to Internet Explorer's lack of support for never due to Internet Explorer's lack of support for
<code>application/xhtml+xml</code> (even though doing so is often <code>application/xhtml+xml</code> (even though doing so is often
argued to be <a href="http://www.hixie.ch/advocacy/xhtml">good practice</a>).</p> argued to be <a href="http://www.hixie.ch/advocacy/xhtml">good
practice</a> and is required by the XHTML 1.1 specification).</p>
<p>For XML, however, this processing instruction is extremely important. <p>For XML, however, this XML Declaration is extremely important.
Since most webservers are not configured to send charsets for .xml files, Since most webservers are not configured to send charsets for .xml files,
this is the only thing a parser has to go on. Furthermore, the default this is the only thing a parser has to go on. Furthermore, the default
for XML files is UTF-8, which often butts heads with more common for XML files is UTF-8, which often butts heads with more common
ISO-8859-1 encoding (you see this in garbled RSS feeds).</p> ISO-8859-1 encoding (you see this in garbled RSS feeds).</p>
<p>In short, if you use XHTML and have gone through the <p>In short, if you use XHTML and have gone through the
trouble of adding the XML header, make sure it jives trouble of adding the XML Declaration, make sure it jives
with your <code>META</code> tags and HTTP headers.</p> with your <code>META</code> tags (which should only be present
if served in text/html) and HTTP headers.</p>
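
Editorial sketch, not part of the commit: one way to check that an XML Declaration and a META tag agree, as the paragraph above advises. Both function names are hypothetical, and the regular expressions are deliberately simple assumptions that only handle the common forms shown in this document; they are not a full HTML or XML parser.

```php
<?php
// Hypothetical helper: encoding named in a leading XML Declaration, or null.
function xml_declared_encoding($doc)
{
    if (preg_match('/^<\?xml[^>]*\bencoding=["\']([A-Za-z0-9._-]+)["\']/', $doc, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: charset named in a META tag, or null.
function meta_declared_encoding($doc)
{
    if (preg_match('/<meta[^>]+charset=["\']?([A-Za-z0-9._-]+)/i', $doc, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

$doc = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
     . '<html><head><meta http-equiv="Content-Type" '
     . 'content="text/html; charset=iso-8859-1" /></head></html>';

// The two declarations disagree here, which is the situation to avoid.
$mismatch = xml_declared_encoding($doc) !== meta_declared_encoding($doc);
```

A complete check would also compare both values against the charset in the HTTP Content-Type header, which this sketch does not inspect.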
<h3 id="fixcharset-internals">Inside the process</h3> <h3 id="fixcharset-internals">Inside the process</h3>
@ -506,7 +516,7 @@ usage in one language sometimes requires the occasional special character
that, without surprise, is not available in your character set. Sometimes that, without surprise, is not available in your character set. Sometimes
developers get around this by adding support for multiple encodings: when developers get around this by adding support for multiple encodings: when
using Chinese, use Big5, when using Japanese, use Shift-JIS, when using Chinese, use Big5, when using Japanese, use Shift-JIS, when
using Greek, etc. Other times, they use character entities with great using Greek, etc. Other times, they use character references with great
zeal.</p> zeal.</p>
<p>UTF-8, however, obviates the need for any of these complicated <p>UTF-8, however, obviates the need for any of these complicated
@ -520,14 +530,14 @@ you don't have to use those user-unfriendly entities.</p>
<p>Websites encoded in Latin-1 (ISO-8859-1) which occasionally need <p>Websites encoded in Latin-1 (ISO-8859-1) which occasionally need
a special character outside of their scope often will use a character a special character outside of their scope often will use a character
entity to achieve the desired effect. For instance, &theta; can be entity reference to achieve the desired effect. For instance, &theta; can be
written <code>&amp;theta;</code>, regardless of the character encoding's written <code>&amp;theta;</code>, regardless of the character encoding's
support of Greek letters.</p> support of Greek letters.</p>
<p>This works nicely for limited use of special characters, but <p>This works nicely for limited use of special characters, but
say you wanted this sentence of Chinese text: &#28608;&#20809;, say you wanted this sentence of Chinese text: &#28608;&#20809;,
&#36889;&#20841;&#20491;&#23383;&#26159;&#29978;&#40636;&#24847;&#24605;. &#36889;&#20841;&#20491;&#23383;&#26159;&#29978;&#40636;&#24847;&#24605;.
The entity-ized version would look like this:</p> The ampersand encoded version would look like this:</p>
<pre>&amp;#28608;&amp;#20809;, &amp;#36889;&amp;#20841;&amp;#20491;&amp;#23383;&amp;#26159;&amp;#29978;&amp;#40636;&amp;#24847;&amp;#24605;</pre> <pre>&amp;#28608;&amp;#20809;, &amp;#36889;&amp;#20841;&amp;#20491;&amp;#23383;&amp;#26159;&amp;#29978;&amp;#40636;&amp;#24847;&amp;#24605;</pre>
@ -545,7 +555,7 @@ an application that originally used ISO-8859-1 but switched to UTF-8
when it became far too cumbersome to support foreign languages. Bots when it became far too cumbersome to support foreign languages. Bots
will now actually go through articles and convert character entities will now actually go through articles and convert character entities
to their corresponding real characters for the sake of user-friendliness to their corresponding real characters for the sake of user-friendliness
and searcheability. See and searchability. See
<a href="http://meta.wikimedia.org/wiki/Help:Special_characters">Meta's <a href="http://meta.wikimedia.org/wiki/Help:Special_characters">Meta's
page on special characters</a> for more details. page on special characters</a> for more details.
</p></blockquote> </p></blockquote>
@ -593,7 +603,7 @@ browser you're using, they might:</p>
<ul> <ul>
<li>Replace the unsupported characters with useless question marks,</li> <li>Replace the unsupported characters with useless question marks,</li>
<li>Attempt to fix the characters (example: smart quotes to regular quotes),</li> <li>Attempt to fix the characters (example: smart quotes to regular quotes),</li>
<li>Replace the character with a character entity, or</li> <li>Replace the character with a character entity reference, or</li>
<li>Send it anyway as a different character encoding mixed in <li>Send it anyway as a different character encoding mixed in
with the original encoding (usually Windows-1252 rather than with the original encoding (usually Windows-1252 rather than
iso-8859-1 or UTF-8 interspersed in 8-bit)</li> iso-8859-1 or UTF-8 interspersed in 8-bit)</li>
@ -609,7 +619,7 @@ since UTF-8 supports every character.</p>
<h4 id="whyutf8-forms-multipart"><code>multipart/form-data</code></h4> <h4 id="whyutf8-forms-multipart"><code>multipart/form-data</code></h4>
<p>Multipart form submission takes a way a lot of the ambiguity <p>Multipart form submission takes away a lot of the ambiguity
that percent-encoding had: the server now can explicitly ask for that percent-encoding had: the server now can explicitly ask for
certain encodings, and the client can explicitly tell the server certain encodings, and the client can explicitly tell the server
during the form submission what encoding the fields are in.</p> during the form submission what encoding the fields are in.</p>
@ -622,9 +632,9 @@ Each method has deficiencies, especially the former.</p>
<p>If you tell the browser to send the form in the same encoding as <p>If you tell the browser to send the form in the same encoding as
the page, you still have the trouble of what to do with characters the page, you still have the trouble of what to do with characters
that are outside of the character encoding's range. The behavior, once that are outside of the character encoding's range. The behavior, once
again, varies: Firefox 2.0 entity-izes them while Internet Explorer again, varies: Firefox 2.0 converts them to character entity references
7.0 mangles them beyond intelligibility. For serious internationalization purposes, while Internet Explorer 7.0 mangles them beyond intelligibility. For
this is not an option.</p> serious internationalization purposes, this is not an option.</p>
<p>The other possibility is to set Accept-Encoding to UTF-8, which <p>The other possibility is to set Accept-Encoding to UTF-8, which
begs the question: Why aren't you using UTF-8 for everything then? begs the question: Why aren't you using UTF-8 for everything then?
@ -664,12 +674,12 @@ it up to the module iconv to do the dirty work.</p>
<p>This approach, however, is not perfect. iconv is blithely unaware <p>This approach, however, is not perfect. iconv is blithely unaware
of HTML character entities. HTML Purifier, in order to of HTML character entities. HTML Purifier, in order to
protect against sophisticated escaping schemes, normalizes all character protect against sophisticated escaping schemes, normalizes all character
and numeric entities before processing the text. This leads to and numeric entity references before processing the text. This leads to
one important ramification:</p> one important ramification:</p>
<p><strong>Any character that is not supported by the target character <p><strong>Any character that is not supported by the target character
set, regardless of whether or not it is in the form of a character set, regardless of whether or not it is in the form of a character
entity or a raw character, will be silently ignored.</strong></p> entity reference or a raw character, will be silently ignored.</strong></p>
<p>Example of this principle at work: say you have <code>&amp;theta;</code> <p>Example of this principle at work: say you have <code>&amp;theta;</code>
in your HTML, but the output is in Latin-1 (which, understandably, in your HTML, but the output is in Latin-1 (which, understandably,
@ -678,7 +688,7 @@ set the encoding correctly using %Core.Encoding):</p>
<ul> <ul>
<li>The <code>Encoder</code> will transform the text from ISO 8859-1 to UTF-8 <li>The <code>Encoder</code> will transform the text from ISO 8859-1 to UTF-8
(note that theta is preserved since it doesn't actually use (note that theta is preserved here since it doesn't actually use
any non-ASCII characters): <code>&amp;theta;</code></li> any non-ASCII characters): <code>&amp;theta;</code></li>
<li>The <code>EntityParser</code> will transform all named and numeric <li>The <code>EntityParser</code> will transform all named and numeric
character entities to their corresponding raw UTF-8 equivalents: character entities to their corresponding raw UTF-8 equivalents:
@ -701,7 +711,7 @@ Purifier has provided a slightly more palatable workaround using
<li>The <code>EntityParser</code> transforms entities: <code>&theta;</code></li> <li>The <code>EntityParser</code> transforms entities: <code>&theta;</code></li>
<li>HTML Purifier processes the code: <code>&theta;</code></li> <li>HTML Purifier processes the code: <code>&theta;</code></li>
<li>The <code>Encoder</code> replaces all non-ASCII characters <li>The <code>Encoder</code> replaces all non-ASCII characters
with numeric entities: <code>&amp;#952;</code></li> with numeric entity references: <code>&amp;#952;</code></li>
<li>For good measure, <code>Encoder</code> transforms encoding back to <li>For good measure, <code>Encoder</code> transforms encoding back to
original (which is strictly unnecessary for 99% of encodings original (which is strictly unnecessary for 99% of encodings
out there): <code>&amp;#952;</code> (remember, it's all ASCII!)</li> out there): <code>&amp;#952;</code> (remember, it's all ASCII!)</li>
@ -711,19 +721,19 @@ Purifier has provided a slightly more palatable workaround using
the land of Unicode characters, and is totally unacceptable for Chinese the land of Unicode characters, and is totally unacceptable for Chinese
or Japanese texts. The even bigger kicker is that, supposing the or Japanese texts. The even bigger kicker is that, supposing the
input encoding was actually ISO-8859-7, which <em>does</em> support input encoding was actually ISO-8859-7, which <em>does</em> support
theta, the character would get entity-ized anyway! (The Encoder does theta, the character would get converted into a character entity reference
not discriminate).</p> anyway! (The Encoder does not discriminate).</p>
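The escaping step described above (every non-ASCII character becomes a numeric reference, whatever the target encoding) can be sketched in plain PHP. This is only an illustration, not HTML Purifier's actual Encoder code: the function names `sketch_escape_non_ascii` and `sketch_char_to_reference` are hypothetical, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper, not part of HTML Purifier: replaces every non-ASCII
// character in a UTF-8 string with a decimal numeric character reference.
function sketch_escape_non_ascii($utf8) {
    return preg_replace_callback(
        '/[^\x00-\x7F]/u',          // with /u, one match is one whole character
        'sketch_char_to_reference',
        $utf8
    );
}

// Hypothetical callback: decodes one UTF-8 encoded character by hand.
function sketch_char_to_reference($match) {
    $bytes = array_values(unpack('C*', $match[0]));
    $count = count($bytes);
    // The lead byte carries 5, 4 or 3 payload bits for 2, 3 or 4 byte forms.
    if ($count === 2) {
        $code = $bytes[0] & 0x1F;
    } elseif ($count === 3) {
        $code = $bytes[0] & 0x0F;
    } else {
        $code = $bytes[0] & 0x07;
    }
    // Each continuation byte contributes its low 6 bits.
    for ($i = 1; $i < $count; $i++) {
        $code = ($code << 6) | ($bytes[$i] & 0x3F);
    }
    return '&#' . $code . ';';
}

// "\xCE\xB8" is theta (U+03B8) as raw UTF-8.
echo sketch_escape_non_ascii("\xCE\xB8"), "\n"; // prints &#952;
```

Because the sketch never asks whether the target encoding could represent the character, theta is escaped even when the output is ISO-8859-7, which is the behavior the paragraph above complains about.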
<p>The current functionality is about where HTML Purifier will be for <p>The current functionality is about where HTML Purifier will be for
the rest of eternity. HTML Purifier could attempt to preserve the original the rest of eternity. HTML Purifier could attempt to preserve the original
form of the entities so that they could be substituted back in, only the form of the character references so that they could be substituted back in, only the
DOM extension kills them off irreversibly. HTML Purifier could also attempt DOM extension kills them off irreversibly. HTML Purifier could also attempt
to be smart and only convert non-ASCII characters that weren't supported to be smart and only convert non-ASCII characters that weren't supported
by the target encoding, but that would require reimplementing iconv by the target encoding, but that would require reimplementing iconv
with HTML awareness, something I will not do.</p> with HTML awareness, something I will not do.</p>
<p>So there: either it's UTF-8 or crippled international support. Your pick! (and I'm <p>So there: either it's UTF-8 or crippled international support. Your pick! (and I'm
not being sarcastic here: some people could care less about other languages)</p> not being sarcastic here: some people could care less about other languages).</p>
<h2 id="migrate">Migrate to UTF-8</h2> <h2 id="migrate">Migrate to UTF-8</h2>
@ -985,7 +995,7 @@ and yes, it is variable width. Other traits:</p>
in different ways. It is beyond the scope of this document to explain in different ways. It is beyond the scope of this document to explain
what precisely these implications are. PHPWact provides what precisely these implications are. PHPWact provides
a very good <a href="http://www.phpwact.org/php/i18n/utf-8">reference document</a> a very good <a href="http://www.phpwact.org/php/i18n/utf-8">reference document</a>
on what to expect from each functions, although coverage is spotty in on what to expect from each function, although coverage is spotty in
some areas. Their more general notes on some areas. Their more general notes on
<a href="http://www.phpwact.org/php/i18n/charsets">character sets</a> <a href="http://www.phpwact.org/php/i18n/charsets">character sets</a>
are also worth looking at for information on UTF-8. Some rules of thumb are also worth looking at for information on UTF-8. Some rules of thumb
@ -999,7 +1009,7 @@ when dealing with Unicode text:</p>
<li>Think twice before using functions that:<ul> <li>Think twice before using functions that:<ul>
<li>...count characters (strlen will return bytes, not characters; <li>...count characters (strlen will return bytes, not characters;
str_split and word_wrap may corrupt)</li> str_split and word_wrap may corrupt)</li>
<li>...entity-ize things (UTF-8 doesn't need entities)</li> <li>...convert characters to entity references (UTF-8 doesn't need entities)</li>
<li>...do very complex string processing (*printf)</li> <li>...do very complex string processing (*printf)</li>
</ul></li> </ul></li>
</ul> </ul>
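The first rule of thumb above (strlen returns bytes, not characters) is easy to check. The sketch below assumes only a PCRE build with UTF-8 support, which is what the `/u` modifier relies on; the variable names are illustrative.

```php
<?php
// "\xCE\xB8" is the UTF-8 encoding of theta (U+03B8): one character, two bytes.
$theta = "\xCE\xB8";

// strlen() counts bytes.
$byte_count = strlen($theta);

// With the /u modifier PCRE treats the subject as UTF-8, so "." matches one
// whole character; preg_match_all() returns the number of matches.
$char_count = preg_match_all('/./su', $theta, $unused);

echo $byte_count, ' ', $char_count, "\n"; // prints "2 1"
```

Any code that truncates or splits a string at a byte offset derived from strlen can therefore cut a multi-byte character in half.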

@ -22,7 +22,7 @@
*/ */
/* /*
HTML Purifier 2.1.1 - Standards Compliant HTML Filtering HTML Purifier 2.1.2 - Standards Compliant HTML Filtering
Copyright (C) 2006 Edward Z. Yang Copyright (C) 2006 Edward Z. Yang
This library is free software; you can redistribute it and/or This library is free software; you can redistribute it and/or
@ -77,7 +77,7 @@ This directive has been available since 2.0.0.
class HTMLPurifier class HTMLPurifier
{ {
var $version = '2.1.1'; var $version = '2.1.2';
var $config; var $config;
var $filters; var $filters;

@ -6,6 +6,7 @@ require_once 'HTMLPurifier/URIScheme.php';
require_once 'HTMLPurifier/URISchemeRegistry.php'; require_once 'HTMLPurifier/URISchemeRegistry.php';
require_once 'HTMLPurifier/AttrDef/URI/Host.php'; require_once 'HTMLPurifier/AttrDef/URI/Host.php';
require_once 'HTMLPurifier/PercentEncoder.php'; require_once 'HTMLPurifier/PercentEncoder.php';
require_once 'HTMLPurifier/AttrDef/URI/Email.php';
// special case filtering directives // special case filtering directives

@ -1,6 +1,7 @@
<?php <?php
require_once 'HTMLPurifier/AttrDef.php'; require_once 'HTMLPurifier/AttrDef.php';
require_once 'HTMLPurifier/AttrDef/URI/Email/SimpleCheck.php';
class HTMLPurifier_AttrDef_URI_Email extends HTMLPurifier_AttrDef class HTMLPurifier_AttrDef_URI_Email extends HTMLPurifier_AttrDef
{ {

@ -44,6 +44,9 @@ class HTMLPurifier_AttrTypes
$this->info['LanguageCode'] = new HTMLPurifier_AttrDef_Lang(); $this->info['LanguageCode'] = new HTMLPurifier_AttrDef_Lang();
$this->info['Color'] = new HTMLPurifier_AttrDef_HTML_Color(); $this->info['Color'] = new HTMLPurifier_AttrDef_HTML_Color();
// unimplemented aliases
$this->info['ContentType'] = new HTMLPurifier_AttrDef_Text();
// number is really a positive integer (one or more digits) // number is really a positive integer (one or more digits)
// FIXME: ^^ not always, see start and value of list items // FIXME: ^^ not always, see start and value of list items
$this->info['Number'] = new HTMLPurifier_AttrDef_Integer(false, false, true); $this->info['Number'] = new HTMLPurifier_AttrDef_Integer(false, false, true);

@ -204,7 +204,7 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
$this->info['border-right'] = new HTMLPurifier_AttrDef_CSS_Border($config); $this->info['border-right'] = new HTMLPurifier_AttrDef_CSS_Border($config);
$this->info['border-collapse'] = new HTMLPurifier_AttrDef_Enum(array( $this->info['border-collapse'] = new HTMLPurifier_AttrDef_Enum(array(
'collapse', 'seperate')); 'collapse', 'separate'));
$this->info['caption-side'] = new HTMLPurifier_AttrDef_Enum(array( $this->info['caption-side'] = new HTMLPurifier_AttrDef_Enum(array(
'top', 'bottom')); 'top', 'bottom'));
@ -219,6 +219,8 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
new HTMLPurifier_AttrDef_CSS_Percentage() new HTMLPurifier_AttrDef_CSS_Percentage()
)); ));
$this->info['border-spacing'] = new HTMLPurifier_AttrDef_CSS_Multiple(new HTMLPurifier_AttrDef_CSS_Length(), 2);
// partial support // partial support
$this->info['white-space'] = new HTMLPurifier_AttrDef_Enum(array('nowrap')); $this->info['white-space'] = new HTMLPurifier_AttrDef_Enum(array('nowrap'));

@ -42,7 +42,7 @@ class HTMLPurifier_Config
/** /**
* HTML Purifier's version * HTML Purifier's version
*/ */
var $version = '2.1.1'; var $version = '2.1.2';
/** /**
* Two-level associative array of configuration directives * Two-level associative array of configuration directives

@ -330,7 +330,7 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
if (isset($this->info_content_sets['Block'][$block_wrapper])) { if (isset($this->info_content_sets['Block'][$block_wrapper])) {
$this->info_block_wrapper = $block_wrapper; $this->info_block_wrapper = $block_wrapper;
} else { } else {
trigger_error('Cannot use non-block element as block wrapper.', trigger_error('Cannot use non-block element as block wrapper',
E_USER_ERROR); E_USER_ERROR);
} }
@ -340,7 +340,7 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
$this->info_parent = $parent; $this->info_parent = $parent;
$this->info_parent_def = $def; $this->info_parent_def = $def;
} else { } else {
trigger_error('Cannot use unrecognized element as parent.', trigger_error('Cannot use unrecognized element as parent',
E_USER_ERROR); E_USER_ERROR);
$this->info_parent_def = $this->manager->getElement($this->info_parent, true); $this->info_parent_def = $this->manager->getElement($this->info_parent, true);
} }

@ -0,0 +1,47 @@
<?php
require_once 'HTMLPurifier/HTMLModule.php';
/**
* XHTML 1.1 Object Module, defines elements for generic object inclusion
* @warning Users will commonly use <embed> to cater to legacy browsers: this
* module does not allow this sort of behavior
*/
class HTMLPurifier_HTMLModule_Object extends HTMLPurifier_HTMLModule
{
var $name = 'Object';
function HTMLPurifier_HTMLModule_Object() {
$this->addElement('object', false, 'Inline', 'Optional: #PCDATA | Flow | param', 'Common',
array(
'archive' => 'URI',
'classid' => 'URI',
'codebase' => 'URI',
'codetype' => 'Text',
'data' => 'URI',
'declare' => 'Bool#declare',
'height' => 'Length',
'name' => 'CDATA',
'standby' => 'Text',
'tabindex' => 'Number',
'type' => 'ContentType',
'width' => 'Length'
)
);
$this->addElement('param', false, false, 'Empty', false,
array(
'id' => 'ID',
'name*' => 'Text',
'type' => 'Text',
'value' => 'Text',
'valuetype' => 'Enum#data,ref,object'
)
);
}
}

@ -29,6 +29,7 @@ require_once 'HTMLPurifier/HTMLModule/Scripting.php';
require_once 'HTMLPurifier/HTMLModule/XMLCommonAttributes.php'; require_once 'HTMLPurifier/HTMLModule/XMLCommonAttributes.php';
require_once 'HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php'; require_once 'HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php';
require_once 'HTMLPurifier/HTMLModule/Ruby.php'; require_once 'HTMLPurifier/HTMLModule/Ruby.php';
require_once 'HTMLPurifier/HTMLModule/Object.php';
// tidy modules // tidy modules
require_once 'HTMLPurifier/HTMLModule/Tidy.php'; require_once 'HTMLPurifier/HTMLModule/Tidy.php';
@ -172,7 +173,7 @@ class HTMLPurifier_HTMLModuleManager
$common = array( $common = array(
'CommonAttributes', 'Text', 'Hypertext', 'List', 'CommonAttributes', 'Text', 'Hypertext', 'List',
'Presentation', 'Edit', 'Bdo', 'Tables', 'Image', 'Presentation', 'Edit', 'Bdo', 'Tables', 'Image',
'StyleAttribute', 'Scripting' 'StyleAttribute', 'Scripting', 'Object'
); );
$transitional = array('Legacy', 'Target'); $transitional = array('Legacy', 'Target');
$xml = array('XMLCommonAttributes'); $xml = array('XMLCommonAttributes');

@ -189,6 +189,9 @@ class HTMLPurifier_Lexer
return new HTMLPurifier_Lexer_DOMLex(); return new HTMLPurifier_Lexer_DOMLex();
case 'DirectLex': case 'DirectLex':
return new HTMLPurifier_Lexer_DirectLex(); return new HTMLPurifier_Lexer_DirectLex();
case 'PH5P':
// experimental Lexer that must be manually included
return new HTMLPurifier_Lexer_PH5P();
default: default:
trigger_error("Cannot instantiate unrecognized Lexer type " . htmlspecialchars($lexer), E_USER_ERROR); trigger_error("Cannot instantiate unrecognized Lexer type " . htmlspecialchars($lexer), E_USER_ERROR);
} }

@ -53,14 +53,7 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
} }
// preprocess html, essential for UTF-8 // preprocess html, essential for UTF-8
$html = $html = $this->wrapHTML($html, $config, $context);
'<!DOCTYPE html '.
'PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"'.
'"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'.
'<html><head>'.
'<meta http-equiv="Content-Type" content="text/html;'.
' charset=utf-8" />'.
'</head><body><div>'.$html.'</div></body></html>';
$doc = new DOMDocument(); $doc = new DOMDocument();
$doc->encoding = 'UTF-8'; // theoretically, the above has this covered $doc->encoding = 'UTF-8'; // theoretically, the above has this covered
@ -177,5 +170,25 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
return '<!--' . str_replace('&', '&amp;', $matches[1]) . $matches[2]; return '<!--' . str_replace('&', '&amp;', $matches[1]) . $matches[2];
} }
/**
* Wraps an HTML fragment in the necessary HTML
*/
function wrapHTML($html, $config, &$context) {
$def = $config->getDefinition('HTML');
$ret = '';
if (!empty($def->doctype->dtdPublic) || !empty($def->doctype->dtdSystem)) {
$ret .= '<!DOCTYPE html ';
if (!empty($def->doctype->dtdPublic)) $ret .= 'PUBLIC "' . $def->doctype->dtdPublic . '" ';
if (!empty($def->doctype->dtdSystem)) $ret .= '"' . $def->doctype->dtdSystem . '" ';
$ret .= '>';
}
$ret .= '<html><head>';
$ret .= '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />';
$ret .= '</head><body><div>'.$html.'</div></body></html>';
return $ret;
}
} }

@ -237,7 +237,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
// trailing slash. Remember, we could have a tag like <br>, so // trailing slash. Remember, we could have a tag like <br>, so
// any later token processing scripts must convert improperly // any later token processing scripts must convert improperly
// classified EmptyTags from StartTags. // classified EmptyTags from StartTags.
$is_self_closing= (strpos($segment,'/') === $strlen_segment-1); $is_self_closing= (strrpos($segment,'/') === $strlen_segment-1);
if ($is_self_closing) { if ($is_self_closing) {
$strlen_segment--; $strlen_segment--;
$segment = substr($segment, 0, $strlen_segment); $segment = substr($segment, 0, $strlen_segment);
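The switch from strpos to strrpos above matters when an attribute value contains a slash. A minimal sketch of the difference, using hypothetical stand-in functions rather than the real DirectLex code, where `$segment` is assumed to be the text between `<` and `>`:

```php
<?php
// Hypothetical stand-in for the old check: looks at the FIRST slash.
function sketch_is_self_closing_first($segment) {
    return strpos($segment, '/') === strlen($segment) - 1;
}

// Hypothetical stand-in for the new check: looks at the LAST slash.
function sketch_is_self_closing_last($segment) {
    return strrpos($segment, '/') === strlen($segment) - 1;
}

// An empty tag whose attribute value contains a slash.
$segment = 'img src="a/b.png" /';

var_dump(sketch_is_self_closing_first($segment)); // bool(false): stops at "a/b"
var_dump(sketch_is_self_closing_last($segment));  // bool(true)
```

Both variants agree on segments with a single trailing slash or with no slash at all; they differ only when an earlier slash is present.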

File diff suppressed because it is too large

@ -25,7 +25,9 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
/** /**
* Whether or not to compress directive names, clipping them off * Whether or not to compress directive names, clipping them off
* after a certain amount of letters * after a certain amount of letters. False to disable or integer letters
* before clipping.
* @protected
*/ */
var $compress = false; var $compress = false;
@ -41,11 +43,13 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
$this->docURL = $doc_url; $this->docURL = $doc_url;
$this->name = $name; $this->name = $name;
$this->compress = $compress; $this->compress = $compress;
// initialize sub-printers
$this->fields['default'] = new HTMLPurifier_Printer_ConfigForm_default(); $this->fields['default'] = new HTMLPurifier_Printer_ConfigForm_default();
$this->fields['bool'] = new HTMLPurifier_Printer_ConfigForm_bool(); $this->fields['bool'] = new HTMLPurifier_Printer_ConfigForm_bool();
} }
/** /**
* Sets default column and row size for textareas in sub-printers
* @param $cols Integer columns of textarea, null to use default * @param $cols Integer columns of textarea, null to use default
* @param $rows Integer rows of textarea, null to use default * @param $rows Integer rows of textarea, null to use default
*/ */
@ -55,15 +59,14 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
} }
/** /**
* Retrieves styling, in case the directory it's in is not publically * Retrieves styling, in case it is not accessible by webserver
* available
*/ */
function getCSS() { function getCSS() {
return file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/Printer/ConfigForm.css'); return file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/Printer/ConfigForm.css');
} }
/** /**
* Retrieves JavaScript, in case directory is not public * Retrieves JavaScript, in case it is not accessible by webserver
*/ */
function getJavaScript() { function getJavaScript() {
return file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/Printer/ConfigForm.js'); return file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/Printer/ConfigForm.js');
@ -97,14 +100,14 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
$ret .= $this->renderNamespace($ns, $directives); $ret .= $this->renderNamespace($ns, $directives);
} }
if ($render_controls) { if ($render_controls) {
$ret .= $this->start('tfoot'); $ret .= $this->start('tbody');
$ret .= $this->start('tr'); $ret .= $this->start('tr');
$ret .= $this->start('td', array('colspan' => 2, 'class' => 'controls')); $ret .= $this->start('td', array('colspan' => 2, 'class' => 'controls'));
$ret .= $this->elementEmpty('input', array('type' => 'Submit', 'value' => 'Submit')); $ret .= $this->elementEmpty('input', array('type' => 'submit', 'value' => 'Submit'));
$ret .= '[<a href="?">Reset</a>]'; $ret .= '[<a href="?">Reset</a>]';
$ret .= $this->end('td'); $ret .= $this->end('td');
$ret .= $this->end('tr'); $ret .= $this->end('tr');
$ret .= $this->end('tfoot'); $ret .= $this->end('tbody');
} }
$ret .= $this->end('table'); $ret .= $this->end('table');
return $ret; return $ret;

@ -102,6 +102,7 @@ class HTMLPurifier_Printer_HTMLDefinition extends HTMLPurifier_Printer
$ret .= $this->element('td', $this->listifyTagLookup($lookup)); $ret .= $this->element('td', $this->listifyTagLookup($lookup));
$ret .= $this->end('tr'); $ret .= $this->end('tr');
} }
$ret .= $this->end('table');
return $ret; return $ret;
} }
@ -179,7 +180,8 @@ class HTMLPurifier_Printer_HTMLDefinition extends HTMLPurifier_Printer
$def->validateChildren(array(), $this->config, $context); $def->validateChildren(array(), $this->config, $context);
} }
$elements = $def->elements; $elements = $def->elements;
} elseif ($def->type == 'chameleon') { }
if ($def->type == 'chameleon') {
$attr['rowspan'] = 2; $attr['rowspan'] = 2;
} elseif ($def->type == 'empty') { } elseif ($def->type == 'empty') {
$elements = array(); $elements = array();

@ -1,5 +1,12 @@
<?php <?php
require_once 'HTMLPurifier/URIScheme/http.php';
require_once 'HTMLPurifier/URIScheme/https.php';
require_once 'HTMLPurifier/URIScheme/mailto.php';
require_once 'HTMLPurifier/URIScheme/ftp.php';
require_once 'HTMLPurifier/URIScheme/nntp.php';
require_once 'HTMLPurifier/URIScheme/news.php';
HTMLPurifier_ConfigSchema::define( HTMLPurifier_ConfigSchema::define(
'URI', 'AllowedSchemes', array( 'URI', 'AllowedSchemes', array(
'http' => true, // "Hypertext Transfer Protocol", nuf' said 'http' => true, // "Hypertext Transfer Protocol", nuf' said
@ -7,7 +14,6 @@ HTMLPurifier_ConfigSchema::define(
// quite useful, but not necessary // quite useful, but not necessary
'mailto' => true,// Email 'mailto' => true,// Email
'ftp' => true, // "File Transfer Protocol" 'ftp' => true, // "File Transfer Protocol"
'irc' => true, // "Internet Relay Chat", usually needs another app
// for Usenet, these two are similar, but distinct // for Usenet, these two are similar, but distinct
'nntp' => true, // individual Netnews articles 'nntp' => true, // individual Netnews articles
'news' => true // newsgroup or individual Netnews articles 'news' => true // newsgroup or individual Netnews articles
@ -54,12 +60,6 @@ class HTMLPurifier_URISchemeRegistry
*/ */
var $schemes = array(); var $schemes = array();
/**
* Directory where scheme objects can be found
* @private
*/
var $_scheme_dir = null;
/** /**
* Retrieves a scheme validator object * Retrieves a scheme validator object
* @param $scheme String scheme name like http or mailto * @param $scheme String scheme name like http or mailto
@ -79,21 +79,16 @@ class HTMLPurifier_URISchemeRegistry
} }
if (isset($this->schemes[$scheme])) return $this->schemes[$scheme]; if (isset($this->schemes[$scheme])) return $this->schemes[$scheme];
if (empty($this->_dir)) $this->_dir = HTMLPURIFIER_PREFIX . '/HTMLPurifier/URIScheme/';
if (!isset($allowed_schemes[$scheme])) return $null; if (!isset($allowed_schemes[$scheme])) return $null;
// this bit of reflection is not very efficient, and a bit
// hacky too
$class = 'HTMLPurifier_URIScheme_' . $scheme; $class = 'HTMLPurifier_URIScheme_' . $scheme;
if (!class_exists($class)) include_once $this->_dir . $scheme . '.php';
if (!class_exists($class)) return $null; if (!class_exists($class)) return $null;
$this->schemes[$scheme] = new $class(); $this->schemes[$scheme] = new $class();
return $this->schemes[$scheme]; return $this->schemes[$scheme];
} }
/** /**
* Registers a custom scheme to the cache. * Registers a custom scheme to the cache, bypassing reflection.
* @param $scheme Scheme name * @param $scheme Scheme name
* @param $scheme_obj HTMLPurifier_URIScheme object * @param $scheme_obj HTMLPurifier_URIScheme object
*/ */

maintenance/PH5P.patch Normal file

@ -0,0 +1,45 @@
--- old.php 2007-08-19 14:42:33.640625000 -0400
+++ new.php 2007-08-19 14:41:51.609375000 -0400
@@ -211,7 +211,10 @@
// If nothing is returned, emit a U+0026 AMPERSAND character token.
// Otherwise, emit the character token that was returned.
$char = (!$entity) ? '&' : $entity;
- $this->emitToken($char);
+ $this->emitToken(array(
+ 'type' => self::CHARACTR,
+ 'data' => $char
+ ));
// Finally, switch to the data state.
$this->state = 'data';
@@ -708,7 +711,7 @@
} elseif($char === '&') {
/* U+0026 AMPERSAND (&)
Switch to the entity in attribute value state. */
- $this->entityInAttributeValueState('non');
+ $this->entityInAttributeValueState();
} elseif($char === '>') {
/* U+003E GREATER-THAN SIGN (>)
@@ -738,7 +741,8 @@
? '&'
: $entity;
- $this->emitToken($char);
+ $last = count($this->token['attr']) - 1;
+ $this->token['attr'][$last]['value'] .= $char;
}
private function bogusCommentState() {
@@ -1066,6 +1070,11 @@
$this->char++;
if(in_array($id, $this->entities)) {
+ if ($e_name[$c-1] !== ';') {
+ if ($c < $len && $e_name[$c] == ';') {
+ $this->char++; // consume extra semicolon
+ }
+ }
$entity = $id;
break;
}

@ -1,5 +1,7 @@
<?php <?php
require_once 'compat-function-file-put-contents.php';
function assertCli() { function assertCli() {
if (php_sapi_name() != 'cli' && !getenv('PHP_IS_CLI')) { if (php_sapi_name() != 'cli' && !getenv('PHP_IS_CLI')) {
echo 'Script cannot be called from web-browser (if you are calling via cli, echo 'Script cannot be called from web-browser (if you are calling via cli,
@ -7,3 +9,135 @@ set environment variable PHP_IS_CLI to work around this).';
exit; exit;
} }
} }
/**
* Filesystem tools not provided by default; can recursively create, copy
* and delete folders. Some template methods are provided for extensibility.
* @note This class must be instantiated to be used, although it does
* not maintain state.
*/
class FSTools
{
/**
* Recursively creates a directory
* @param string $folder Name of folder to create
* @note Adapted from the PHP manual comment 76612
*/
function mkdir($folder) {
$folders = preg_split("#[\\\\/]#", $folder);
$base = '';
for($i = 0, $c = count($folders); $i < $c; $i++) {
if(empty($folders[$i])) {
if (!$i) {
// special case for root level
$base .= DIRECTORY_SEPARATOR;
}
continue;
}
$base .= $folders[$i];
if(!is_dir($base)){
mkdir($base);
}
$base .= DIRECTORY_SEPARATOR;
}
}
/**
* Copy a file, or recursively copy a folder and its contents; modified
* so that copied files, if PHP, have includes removed
*
* @author Aidan Lister <aidan@php.net>
* @version 1.0.1-modified
* @link http://aidanlister.com/repos/v/function.copyr.php
* @param string $source Source path
* @param string $dest Destination path
* @return bool Returns TRUE on success, FALSE on failure
*/
function copyr($source, $dest) {
// Simple copy for a file
if (is_file($source)) {
return $this->copy($source, $dest);
}
// Make destination directory
if (!is_dir($dest)) {
mkdir($dest);
}
// Loop through the folder
$dir = dir($source);
while (false !== $entry = $dir->read()) {
// Skip pointers
if ($entry == '.' || $entry == '..') {
continue;
}
if (!$this->copyable($entry)) {
continue;
}
// Deep copy directories
if ($dest !== "$source/$entry") {
$this->copyr("$source/$entry", "$dest/$entry");
}
}
// Clean up
$dir->close();
return true;
}
/**
* Stub for PHP's built-in copy function, can be used to overload
* functionality
*/
function copy($source, $dest) {
return copy($source, $dest);
}
/**
* Overloadable function that tests a filename for copyability. By
* default, everything should be copied; you can restrict things to
* ignore hidden files, unreadable files, etc.
*/
function copyable($file) {
return true;
}
/**
* Delete a file, or a folder and its contents
*
* @author Aidan Lister <aidan@php.net>
* @version 1.0.3
* @link http://aidanlister.com/repos/v/function.rmdirr.php
* @param string $dirname Directory to delete
* @return bool Returns TRUE on success, FALSE on failure
*/
function rmdirr($dirname)
{
// Sanity check
if (!file_exists($dirname)) {
return false;
}
// Simple delete for a file
if (is_file($dirname) || is_link($dirname)) {
return unlink($dirname);
}
// Loop through the folder
$dir = dir($dirname);
while (false !== $entry = $dir->read()) {
// Skip pointers
if ($entry == '.' || $entry == '..') {
continue;
}
// Recurse
$this->rmdirr($dirname . DIRECTORY_SEPARATOR . $entry);
}
// Clean up
$dir->close();
return rmdir($dirname);
}
}

@ -0,0 +1,107 @@
<?php
// $Id: file_put_contents.php,v 1.27 2007/04/17 10:09:56 arpad Exp $
if (!defined('FILE_USE_INCLUDE_PATH')) {
define('FILE_USE_INCLUDE_PATH', 1);
}
if (!defined('LOCK_EX')) {
define('LOCK_EX', 2);
}
if (!defined('FILE_APPEND')) {
define('FILE_APPEND', 8);
}
/**
* Replace file_put_contents()
*
* @category PHP
* @package PHP_Compat
* @license LGPL - http://www.gnu.org/licenses/lgpl.html
* @copyright 2004-2007 Aidan Lister <aidan@php.net>, Arpad Ray <arpad@php.net>
* @link http://php.net/function.file_put_contents
* @author Aidan Lister <aidan@php.net>
* @version $Revision: 1.27 $
* @internal resource_context is not supported
* @since PHP 5
* @require PHP 4.0.0 (user_error)
*/
function php_compat_file_put_contents($filename, $content, $flags = null, $resource_context = null)
{
// If $content is an array, convert it to a string
if (is_array($content)) {
$content = implode('', $content);
}
// If we don't have a string, throw an error
if (!is_scalar($content)) {
user_error('file_put_contents() The 2nd parameter should be either a string or an array',
E_USER_WARNING);
return false;
}
// Get the length of data to write
$length = strlen($content);
// Check what mode we are using
$mode = ($flags & FILE_APPEND) ?
'a' :
'wb';
// Check if we're using the include path
$use_inc_path = ($flags & FILE_USE_INCLUDE_PATH) ?
true :
false;
// Open the file for writing
if (($fh = @fopen($filename, $mode, $use_inc_path)) === false) {
user_error('file_put_contents() failed to open stream: Permission denied',
E_USER_WARNING);
return false;
}
// Attempt to get an exclusive lock
$use_lock = ($flags & LOCK_EX) ? true : false ;
if ($use_lock === true) {
if (!flock($fh, LOCK_EX)) {
return false;
}
}
// Write to the file
$bytes = 0;
if (($bytes = @fwrite($fh, $content)) === false) {
$errormsg = sprintf('file_put_contents() Failed to write %d bytes to %s',
$length,
$filename);
user_error($errormsg, E_USER_WARNING);
return false;
}
// Close the handle
@fclose($fh);
// Check all the data was written
if ($bytes != $length) {
$errormsg = sprintf('file_put_contents() Only %d of %d bytes written, possibly out of free disk space.',
$bytes,
$length);
user_error($errormsg, E_USER_WARNING);
return false;
}
// Return length
return $bytes;
}
// Define
if (!function_exists('file_put_contents')) {
function file_put_contents($filename, $content, $flags = null, $resource_context = null)
{
return php_compat_file_put_contents($filename, $content, $flags, $resource_context);
}
}

@ -6,20 +6,38 @@ assertCli();
/** /**
* Compiles all of HTML Purifier's library files into one big file * Compiles all of HTML Purifier's library files into one big file
* named HTMLPurifier.standalone.php. Operates recursively, and will * named HTMLPurifier.standalone.php.
* barf if there are conditional includes.
*
* Details: also creates blank "include" files in the test/blank directory
* in order to simulate require_once's inside the test files.
*/ */
/** /**
* Global array that tracks already loaded includes * Global hash that tracks already loaded includes
*/ */
$GLOBALS['loaded'] = array('HTMLPurifier.php' => true); $GLOBALS['loaded'] = array('HTMLPurifier.php' => true);
/** /**
* @param $text Text to replace includes from * Custom FSTools for this script that overloads some behavior
* @warning The overloading of copy() is not necessarily global for
* this script. Watch out!
*/
class MergeLibraryFSTools extends FSTools
{
function copyable($entry) {
// Skip hidden files
if ($entry[0] == '.') {
return false;
}
return true;
}
function copy($source, $dest) {
return copy_and_remove_includes($source, $dest);
}
}
$FS = new MergeLibraryFSTools();
/**
* Replaces the includes inside PHP source code with the corresponding
* source.
* @param string $text PHP source code to replace includes from
*/ */
function replace_includes($text) { function replace_includes($text) {
return preg_replace_callback( return preg_replace_callback(
@ -32,6 +50,8 @@ function replace_includes($text) {
/** /**
* Removes leading PHP tags from included files. Assumes that there is * Removes leading PHP tags from included files. Assumes that there is
* no trailing tag. * no trailing tag.
* @note This is safe for files that have internal <?php
* @param string $text Text to have leading PHP tag removed from
*/ */
function remove_php_tags($text) { function remove_php_tags($text) {
return substr($text, 5); return substr($text, 5);
@ -40,125 +60,48 @@ function remove_php_tags($text) {
/** /**
* Creates an appropriate blank file, recursively generating directories * Creates an appropriate blank file, recursively generating directories
* if necessary * if necessary
* @param string $file Filename to create blank for
*/ */
function create_blank($file) { function create_blank($file) {
global $FS;
$dir = dirname($file); $dir = dirname($file);
$base = realpath('../tests/blanks/') . DIRECTORY_SEPARATOR ; $base = realpath('../tests/blanks/') . DIRECTORY_SEPARATOR ;
if ($dir != '.') mkdir_deep($base . $dir); if ($dir != '.') {
$FS->mkdir($base . $dir);
}
file_put_contents($base . $file, ''); file_put_contents($base . $file, '');
} }
/** /**
* Recursively creates a directory * Copies the contents of a directory to the standalone directory
* @note Adapted from the PHP manual comment 76612 * @param string $dir Directory to copy
*/ */
function mkdir_deep($folder) { function make_dir_standalone($dir) {
$folders = preg_split("#[\\\\/]#", $folder); global $FS;
$base = ''; return $FS->copyr($dir, 'standalone/' . $dir);
for($i = 0, $c = count($folders); $i < $c; $i++) {
if(empty($folders[$i])) {
if (!$i) {
// special case for root level
$base .= DIRECTORY_SEPARATOR;
}
continue;
}
$base .= $folders[$i];
if(!is_dir($base)){
mkdir($base);
}
$base .= DIRECTORY_SEPARATOR;
}
} }
/** /**
* Copy a file, or recursively copy a folder and its contents * Copies the contents of a file to the standalone directory
* * @param string $file File to copy
* @author Aidan Lister <aidan@php.net>
* @version 1.0.1
* @link http://aidanlister.com/repos/v/function.copyr.php
* @param string $source Source path
* @param string $dest Destination path
* @return bool Returns TRUE on success, FALSE on failure
*/ */
function copyr($source, $dest) { function make_file_standalone($file) {
// Simple copy for a file global $FS;
if (is_file($source)) { $FS->mkdir('standalone/' . dirname($file));
return copy($source, $dest); copy_and_remove_includes($file, 'standalone/' . $file);
}
// Make destination directory
if (!is_dir($dest)) {
mkdir($dest);
}
// Loop through the folder
$dir = dir($source);
while (false !== $entry = $dir->read()) {
// Skip pointers
if ($entry == '.' || $entry == '..') {
continue;
}
// Skip hidden files
if ($entry[0] == '.') {
continue;
}
// Deep copy directories
if ($dest !== "$source/$entry") {
copyr("$source/$entry", "$dest/$entry");
}
}
// Clean up
$dir->close();
return true; return true;
} }
/** /**
* Delete a file, or a folder and its contents * Copies a file to another location recursively, if it is a PHP file
* * remove includes
* @author Aidan Lister <aidan@php.net> * @param string $file Original file
* @version 1.0.3 * @param string $sfile New location of file
* @link http://aidanlister.com/repos/v/function.rmdirr.php
* @param string $dirname Directory to delete
* @return bool Returns TRUE on success, FALSE on failure
*/ */
function rmdirr($dirname) function copy_and_remove_includes($file, $sfile) {
{ $contents = file_get_contents($file);
// Sanity check if (strrchr($file, '.') === '.php') $contents = replace_includes($contents);
if (!file_exists($dirname)) { return file_put_contents($sfile, $contents);
return false;
}
// Simple delete for a file
if (is_file($dirname) || is_link($dirname)) {
return unlink($dirname);
}
// Loop through the folder
$dir = dir($dirname);
while (false !== $entry = $dir->read()) {
// Skip pointers
if ($entry == '.' || $entry == '..') {
continue;
}
// Recurse
rmdirr($dirname . DIRECTORY_SEPARATOR . $entry);
}
// Clean up
$dir->close();
return rmdir($dirname);
}
/**
* Copies the contents of a directory to the standalone directory
*/
function make_dir_standalone($dir) {
return copyr($dir, 'standalone/' . $dir);
}
function make_file_standalone($file) {
mkdir_deep('standalone/' . dirname($file));
return copy($file, 'standalone/' . $file);
} }
/** /**
@@ -167,8 +110,14 @@ function make_file_standalone($file) {
*/ */
function replace_includes_callback($matches) { function replace_includes_callback($matches) {
$file = $matches[1]; $file = $matches[1];
// PHP 5 only file $preserve = array(
if ($file == 'HTMLPurifier/Lexer/DOMLex.php') { // PHP 5 only
'HTMLPurifier/Lexer/DOMLex.php' => 1,
'HTMLPurifier/Printer.php' => 1,
// PEAR (external)
'XML/HTMLSax3.php' => 1
);
if (isset($preserve[$file])) {
return $matches[0]; return $matches[0];
} }
if (isset($GLOBALS['loaded'][$file])) return ''; if (isset($GLOBALS['loaded'][$file])) return '';
@@ -192,16 +141,22 @@ file_put_contents('HTMLPurifier.standalone.php', $contents);
echo ' done!' . PHP_EOL; echo ' done!' . PHP_EOL;
echo 'Creating standalone directory...'; echo 'Creating standalone directory...';
rmdirr('standalone'); // ensure a clean copy $FS->rmdirr('standalone'); // ensure a clean copy
mkdir_deep('standalone/HTMLPurifier/DefinitionCache/Serializer');
make_dir_standalone('HTMLPurifier/EntityLookup');
make_dir_standalone('HTMLPurifier/Language');
make_file_standalone('HTMLPurifier/Printer/ConfigForm.js');
make_file_standalone('HTMLPurifier/Printer/ConfigForm.css');
make_dir_standalone('HTMLPurifier/URIScheme');
// PHP 5 only file
mkdir_deep('standalone/HTMLPurifier/Lexer');
make_file_standalone('HTMLPurifier/Lexer/DOMLex.php');
make_file_standalone('HTMLPurifier/TokenFactory.php');
echo ' done!' . PHP_EOL;
// data files
$FS->mkdir('standalone/HTMLPurifier/DefinitionCache/Serializer');
make_dir_standalone('HTMLPurifier/EntityLookup');
// non-standard inclusion setup
make_dir_standalone('HTMLPurifier/Language');
// optional components
make_file_standalone('HTMLPurifier/Printer.php');
make_dir_standalone('HTMLPurifier/Printer');
make_dir_standalone('HTMLPurifier/Filter');
make_file_standalone('HTMLPurifier/Lexer/PEARSax3.php');
// PHP 5 only files
make_file_standalone('HTMLPurifier/Lexer/DOMLex.php');
make_file_standalone('HTMLPurifier/Lexer/PH5P.php');
echo ' done!' . PHP_EOL;

@@ -31,7 +31,7 @@ while (false !== ($filename = readdir($dh))) {
if ($filename == 'all.php') continue; if ($filename == 'all.php') continue;
if ($filename == 'testSchema.php') continue; if ($filename == 'testSchema.php') continue;
?> ?>
<iframe src="<?php echo escapeHTML($filename); ?>"></iframe> <iframe src="<?php echo escapeHTML($filename); if (isset($_GET['standalone'])) {echo '?standalone';} ?>"></iframe>
<?php <?php
} }

@@ -2,7 +2,11 @@
header('Content-type: text/html; charset=UTF-8'); header('Content-type: text/html; charset=UTF-8');
require_once '../library/HTMLPurifier.auto.php'; if (!isset($_GET['standalone'])) {
require_once '../library/HTMLPurifier.auto.php';
} else {
require_once '../library/HTMLPurifier.standalone.php';
}
error_reporting(E_ALL | E_STRICT); error_reporting(E_ALL | E_STRICT);
function escapeHTML($string) { function escapeHTML($string) {

@@ -54,14 +54,14 @@ function isInScopes($array = array()) {
} }
/**#@-*/ /**#@-*/
function printTokens($tokens, $index) { function printTokens($tokens, $index = null) {
$string = '<pre>'; $string = '<pre>';
$generator = new HTMLPurifier_Generator(); $generator = new HTMLPurifier_Generator();
foreach ($tokens as $i => $token) { foreach ($tokens as $i => $token) {
if ($index == $i) $string .= '[<strong>'; if ($index === $i) $string .= '[<strong>';
$string .= "<sup>$i</sup>"; $string .= "<sup>$i</sup>";
$string .= $generator->escape($generator->generateFromToken($token)); $string .= $generator->escape($generator->generateFromToken($token));
if ($index == $i) $string .= '</strong>]'; if ($index === $i) $string .= '</strong>]';
} }
$string .= '</pre>'; $string .= '</pre>';
echo $string; echo $string;

@@ -67,6 +67,7 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
$this->assertDef('border:1px solid #000;'); $this->assertDef('border:1px solid #000;');
$this->assertDef('border-bottom:2em double #FF00FA;'); $this->assertDef('border-bottom:2em double #FF00FA;');
$this->assertDef('border-collapse:collapse;'); $this->assertDef('border-collapse:collapse;');
$this->assertDef('border-collapse:separate;');
$this->assertDef('caption-side:top;'); $this->assertDef('caption-side:top;');
$this->assertDef('vertical-align:middle;'); $this->assertDef('vertical-align:middle;');
$this->assertDef('vertical-align:12px;'); $this->assertDef('vertical-align:12px;');
@@ -79,6 +80,8 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
$this->assertDef('background-repeat:repeat-y;'); $this->assertDef('background-repeat:repeat-y;');
$this->assertDef('background-attachment:fixed;'); $this->assertDef('background-attachment:fixed;');
$this->assertDef('background-position:left 90%;'); $this->assertDef('background-position:left 90%;');
$this->assertDef('border-spacing:1em;');
$this->assertDef('border-spacing:1em 2em;');
// duplicates // duplicates
$this->assertDef('text-align:right;text-align:left;', $this->assertDef('text-align:right;text-align:left;',

@@ -11,18 +11,19 @@ class HTMLPurifier_AttrTransform_BdoDirTest extends HTMLPurifier_AttrTransformHa
$this->obj = new HTMLPurifier_AttrTransform_BdoDir(); $this->obj = new HTMLPurifier_AttrTransform_BdoDir();
} }
function test() { function testAddDefaultDir() {
$this->assertResult( array(), array('dir' => 'ltr') ); $this->assertResult( array(), array('dir' => 'ltr') );
}
// leave existing dir alone
function testPreserveExistingDir() {
$this->assertResult( array('dir' => 'rtl') ); $this->assertResult( array('dir' => 'rtl') );
}
// use a different default
function testAlternateDefault() {
$this->config->set('Attr', 'DefaultTextDir', 'rtl');
$this->assertResult( $this->assertResult(
array(), array(),
array('dir' => 'rtl'), array('dir' => 'rtl')
array('Attr.DefaultTextDir' => 'rtl')
); );
} }

@@ -3,6 +3,10 @@
require_once 'HTMLPurifier/AttrTransform/BgColor.php'; require_once 'HTMLPurifier/AttrTransform/BgColor.php';
require_once 'HTMLPurifier/AttrTransformHarness.php'; require_once 'HTMLPurifier/AttrTransformHarness.php';
// we currently rely on the CSS validator to fix any problems.
// This means that this transform, strictly speaking, supports
// a superset of the functionality.
class HTMLPurifier_AttrTransform_BgColorTest extends HTMLPurifier_AttrTransformHarness class HTMLPurifier_AttrTransform_BgColorTest extends HTMLPurifier_AttrTransformHarness
{ {
@@ -11,31 +15,31 @@ class HTMLPurifier_AttrTransform_BgColorTest extends HTMLPurifier_AttrTransformH
$this->obj = new HTMLPurifier_AttrTransform_BgColor(); $this->obj = new HTMLPurifier_AttrTransform_BgColor();
} }
function test() { function testEmptyInput() {
$this->assertResult( array() ); $this->assertResult( array() );
}
// we currently rely on the CSS validator to fix any problems.
// This means that this transform, strictly speaking, supports function testBasicTransform() {
// a superset of the functionality.
$this->assertResult( $this->assertResult(
array('bgcolor' => '#000000'), array('bgcolor' => '#000000'),
array('style' => 'background-color:#000000;') array('style' => 'background-color:#000000;')
); );
}
function testPrependNewCSS() {
$this->assertResult( $this->assertResult(
array('bgcolor' => '#000000', 'style' => 'font-weight:bold'), array('bgcolor' => '#000000', 'style' => 'font-weight:bold'),
array('style' => 'background-color:#000000;font-weight:bold') array('style' => 'background-color:#000000;font-weight:bold')
); );
}
function testLenientTreatmentOfInvalidInput() {
// this may change when we natively support the datatype and // this may change when we natively support the datatype and
// validate its contents before forwarding it on // validate its contents before forwarding it on
$this->assertResult( $this->assertResult(
array('bgcolor' => '#F00'), array('bgcolor' => '#F00'),
array('style' => 'background-color:#F00;') array('style' => 'background-color:#F00;')
); );
} }
} }

@@ -11,27 +11,29 @@ class HTMLPurifier_AttrTransform_BoolToCSSTest extends HTMLPurifier_AttrTransfor
$this->obj = new HTMLPurifier_AttrTransform_BoolToCSS('foo', 'bar:3in;'); $this->obj = new HTMLPurifier_AttrTransform_BoolToCSS('foo', 'bar:3in;');
} }
function test() { function testEmptyInput() {
$this->assertResult( array() ); $this->assertResult( array() );
}
function testBasicTransform() {
$this->assertResult( $this->assertResult(
array('foo' => 'foo'), array('foo' => 'foo'),
array('style' => 'bar:3in;') array('style' => 'bar:3in;')
); );
}
// boolean attribute just has to be set: we don't care about
// anything else function testIgnoreValueOfBooleanAttribute() {
$this->assertResult( $this->assertResult(
array('foo' => 'no'), array('foo' => 'no'),
array('style' => 'bar:3in;') array('style' => 'bar:3in;')
); );
}
function testPrependCSS() {
$this->assertResult( $this->assertResult(
array('foo' => 'foo', 'style' => 'background-color:#F00;'), array('foo' => 'foo', 'style' => 'background-color:#F00;'),
array('style' => 'bar:3in;background-color:#F00;') array('style' => 'bar:3in;background-color:#F00;')
); );
} }
} }

@@ -12,27 +12,29 @@ class HTMLPurifier_AttrTransform_BorderTest extends HTMLPurifier_AttrTransformHa
$this->obj = new HTMLPurifier_AttrTransform_Border(); $this->obj = new HTMLPurifier_AttrTransform_Border();
} }
function test() { function testEmptyInput() {
$this->assertResult( array() ); $this->assertResult( array() );
}
function testBasicTransform() {
$this->assertResult( $this->assertResult(
array('border' => '1'), array('border' => '1'),
array('style' => 'border:1px solid;') array('style' => 'border:1px solid;')
); );
}
// once again, no validation done here, we expect CSS validator
// to catch it function testLenientTreatmentOfInvalidInput() {
$this->assertResult( $this->assertResult(
array('border' => '10%'), array('border' => '10%'),
array('style' => 'border:10%px solid;') array('style' => 'border:10%px solid;')
); );
}
function testPrependNewCSS() {
$this->assertResult( $this->assertResult(
array('border' => '23', 'style' => 'font-weight:bold;'), array('border' => '23', 'style' => 'font-weight:bold;'),
array('style' => 'border:23px solid;font-weight:bold;') array('style' => 'border:23px solid;font-weight:bold;')
); );
} }
} }

@@ -6,38 +6,44 @@ require_once 'HTMLPurifier/AttrTransformHarness.php';
class HTMLPurifier_AttrTransform_EnumToCSSTest extends HTMLPurifier_AttrTransformHarness class HTMLPurifier_AttrTransform_EnumToCSSTest extends HTMLPurifier_AttrTransformHarness
{ {
function testRegular() { function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array( $this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
'left' => 'text-align:left;', 'left' => 'text-align:left;',
'right' => 'text-align:right;' 'right' => 'text-align:right;'
)); ));
}
// leave empty arrays alone
function testEmptyInput() {
$this->assertResult( array() ); $this->assertResult( array() );
}
// leave arrays without interesting stuff alone
function testPreserveArraysWithoutInterestingAttributes() {
$this->assertResult( array('style' => 'font-weight:bold;') ); $this->assertResult( array('style' => 'font-weight:bold;') );
}
// test each of the conversions
function testConvertAlignLeft() {
$this->assertResult( $this->assertResult(
array('align' => 'left'), array('align' => 'left'),
array('style' => 'text-align:left;') array('style' => 'text-align:left;')
); );
}
function testConvertAlignRight() {
$this->assertResult( $this->assertResult(
array('align' => 'right'), array('align' => 'right'),
array('style' => 'text-align:right;') array('style' => 'text-align:right;')
); );
}
// drop garbage value
function testRemoveInvalidAlign() {
$this->assertResult( $this->assertResult(
array('align' => 'invalid'), array('align' => 'invalid'),
array() array()
); );
}
// test CSS munging
function testPrependNewCSS() {
$this->assertResult( $this->assertResult(
array('align' => 'left', 'style' => 'font-weight:bold;'), array('align' => 'left', 'style' => 'font-weight:bold;'),
array('style' => 'text-align:left;font-weight:bold;') array('style' => 'text-align:left;font-weight:bold;')
@@ -46,31 +52,23 @@ class HTMLPurifier_AttrTransform_EnumToCSSTest extends HTMLPurifier_AttrTransfor
} }
function testCaseInsensitive() { function testCaseInsensitive() {
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array( $this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
'right' => 'text-align:right;' 'right' => 'text-align:right;'
)); ));
// test case insensitivity
$this->assertResult( $this->assertResult(
array('align' => 'RIGHT'), array('align' => 'RIGHT'),
array('style' => 'text-align:right;') array('style' => 'text-align:right;')
); );
} }
function testCaseSensitive() { function testCaseSensitive() {
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array( $this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
'right' => 'text-align:right;' 'right' => 'text-align:right;'
), true); ), true);
// test case insensitivity
$this->assertResult( $this->assertResult(
array('align' => 'RIGHT'), array('align' => 'RIGHT'),
array() array()
); );
} }
} }

@@ -11,39 +11,37 @@ class HTMLPurifier_AttrTransform_ImgRequiredTest extends HTMLPurifier_AttrTransf
$this->obj = new HTMLPurifier_AttrTransform_ImgRequired(); $this->obj = new HTMLPurifier_AttrTransform_ImgRequired();
} }
function test() { function testAddMissingAttr() {
$this->config->set('Core', 'RemoveInvalidImg', false);
$this->assertResult( $this->assertResult(
array(), array(),
array('src' => '', 'alt' => 'Invalid image'), array('src' => '', 'alt' => 'Invalid image')
array(
'Core.RemoveInvalidImg' => false
)
); );
}
function testAlternateDefaults() {
$this->config->set('Attr', 'DefaultInvalidImage', 'blank.png');
$this->config->set('Attr', 'DefaultInvalidImageAlt', 'Pawned!');
$this->config->set('Core', 'RemoveInvalidImg', false);
$this->assertResult( $this->assertResult(
array(), array(),
array('src' => 'blank.png', 'alt' => 'Pawned!'), array('src' => 'blank.png', 'alt' => 'Pawned!')
array(
'Attr.DefaultInvalidImage' => 'blank.png',
'Attr.DefaultInvalidImageAlt' => 'Pawned!',
'Core.RemoveInvalidImg' => false
)
); );
}
function testGenerateAlt() {
$this->assertResult( $this->assertResult(
array('src' => '/path/to/foobar.png'), array('src' => '/path/to/foobar.png'),
array('src' => '/path/to/foobar.png', 'alt' => 'foobar.png') array('src' => '/path/to/foobar.png', 'alt' => 'foobar.png')
); );
}
function testAddDefaultSrc() {
$this->config->set('Core', 'RemoveInvalidImg', false);
$this->assertResult( $this->assertResult(
array('alt' => 'intrigue'), array('alt' => 'intrigue'),
array('alt' => 'intrigue', 'src' => ''), array('alt' => 'intrigue', 'src' => '')
array(
'Core.RemoveInvalidImg' => false
)
); );
} }
} }

@@ -9,33 +9,35 @@ class HTMLPurifier_AttrTransform_ImgSpaceTest extends HTMLPurifier_AttrTransform
function setUp() { function setUp() {
parent::setUp(); parent::setUp();
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('vspace');
} }
function testVertical() { function testEmptyInput() {
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('vspace');
$this->assertResult( array() ); $this->assertResult( array() );
}
function testVerticalBasicUsage() {
$this->assertResult( $this->assertResult(
array('vspace' => '1'), array('vspace' => '1'),
array('style' => 'margin-top:1px;margin-bottom:1px;') array('style' => 'margin-top:1px;margin-bottom:1px;')
); );
}
// no validation done here, we expect CSS validator to catch it
function testLenientHandlingOfInvalidInput() {
$this->assertResult( $this->assertResult(
array('vspace' => '10%'), array('vspace' => '10%'),
array('style' => 'margin-top:10%px;margin-bottom:10%px;') array('style' => 'margin-top:10%px;margin-bottom:10%px;')
); );
}
function testPrependNewCSS() {
$this->assertResult( $this->assertResult(
array('vspace' => '23', 'style' => 'font-weight:bold;'), array('vspace' => '23', 'style' => 'font-weight:bold;'),
array('style' => 'margin-top:23px;margin-bottom:23px;font-weight:bold;') array('style' => 'margin-top:23px;margin-bottom:23px;font-weight:bold;')
); );
} }
function testHorizontal() { function testHorizontalBasicUsage() {
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('hspace'); $this->obj = new HTMLPurifier_AttrTransform_ImgSpace('hspace');
$this->assertResult( $this->assertResult(
array('hspace' => '1'), array('hspace' => '1'),
@@ -43,7 +45,7 @@ class HTMLPurifier_AttrTransform_ImgSpaceTest extends HTMLPurifier_AttrTransform
); );
} }
function testInvalid() { function testInvalidConstructionParameter() {
$this->expectError('ispace is not valid space attribute'); $this->expectError('ispace is not valid space attribute');
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('ispace'); $this->obj = new HTMLPurifier_AttrTransform_ImgSpace('ispace');
$this->assertResult( $this->assertResult(

@@ -13,35 +13,36 @@ class HTMLPurifier_AttrTransform_LangTest
$this->obj = new HTMLPurifier_AttrTransform_Lang(); $this->obj = new HTMLPurifier_AttrTransform_Lang();
} }
function test() { function testEmptyInput() {
$this->assertResult(array());
// leave non-lang'ed elements alone }
$this->assertResult(array(), true);
function testCopyLangToXMLLang() {
// copy lang to xml:lang
$this->assertResult( $this->assertResult(
array('lang' => 'en'), array('lang' => 'en'),
array('lang' => 'en', 'xml:lang' => 'en') array('lang' => 'en', 'xml:lang' => 'en')
); );
}
// preserve attributes
function testPreserveAttributes() {
$this->assertResult( $this->assertResult(
array('src' => 'vert.png', 'lang' => 'fr'), array('src' => 'vert.png', 'lang' => 'fr'),
array('src' => 'vert.png', 'lang' => 'fr', 'xml:lang' => 'fr') array('src' => 'vert.png', 'lang' => 'fr', 'xml:lang' => 'fr')
); );
}
// copy xml:lang to lang
function testCopyXMLLangToLang() {
$this->assertResult( $this->assertResult(
array('xml:lang' => 'en'), array('xml:lang' => 'en'),
array('xml:lang' => 'en', 'lang' => 'en') array('xml:lang' => 'en', 'lang' => 'en')
); );
}
// both set, override lang with xml:lang
function testXMLLangOverridesLang() {
$this->assertResult( $this->assertResult(
array('lang' => 'fr', 'xml:lang' => 'de'), array('lang' => 'fr', 'xml:lang' => 'de'),
array('lang' => 'de', 'xml:lang' => 'de') array('lang' => 'de', 'xml:lang' => 'de')
); );
} }
} }

@@ -11,21 +11,32 @@ class HTMLPurifier_AttrTransform_LengthTest extends HTMLPurifier_AttrTransformHa
$this->obj = new HTMLPurifier_AttrTransform_Length('width'); $this->obj = new HTMLPurifier_AttrTransform_Length('width');
} }
function test() { function testEmptyInput() {
$this->assertResult( array() ); $this->assertResult( array() );
}
function testTransformPixel() {
$this->assertResult( $this->assertResult(
array('width' => '10'), array('width' => '10'),
array('style' => 'width:10px;') array('style' => 'width:10px;')
); );
}
function testTransformPercentage() {
$this->assertResult( $this->assertResult(
array('width' => '10%'), array('width' => '10%'),
array('style' => 'width:10%;') array('style' => 'width:10%;')
); );
}
function testPrependNewCSS() {
$this->assertResult( $this->assertResult(
array('width' => '10%', 'style' => 'font-weight:bold'), array('width' => '10%', 'style' => 'font-weight:bold'),
array('style' => 'width:10%;font-weight:bold') array('style' => 'width:10%;font-weight:bold')
); );
// this behavior might change }
function testLenientTreatmentOfInvalidInput() {
$this->assertResult( $this->assertResult(
array('width' => 'asdf'), array('width' => 'asdf'),
array('style' => 'width:asdf;') array('style' => 'width:asdf;')

@@ -11,12 +11,18 @@ class HTMLPurifier_AttrTransform_NameTest extends HTMLPurifier_AttrTransformHarn
$this->obj = new HTMLPurifier_AttrTransform_Name(); $this->obj = new HTMLPurifier_AttrTransform_Name();
} }
function test() { function testEmpty() {
$this->assertResult( array() ); $this->assertResult( array() );
}
function testTransformNameToID() {
$this->assertResult( $this->assertResult(
array('name' => 'free'), array('name' => 'free'),
array('id' => 'free') array('id' => 'free')
); );
}
function testExistingIDOverridesName() {
$this->assertResult( $this->assertResult(
array('name' => 'tryit', 'id' => 'tobad'), array('name' => 'tryit', 'id' => 'tobad'),
array('id' => 'tobad') array('id' => 'tobad')

@@ -6,6 +6,7 @@ class HTMLPurifier_AttrTransformHarness extends HTMLPurifier_ComplexHarness
{ {
function setUp() { function setUp() {
parent::setUp();
$this->func = 'transform'; $this->func = 'transform';
} }

@@ -35,7 +35,7 @@ class HTMLPurifier_AttrValidator_ErrorsTest extends HTMLPurifier_ErrorsHarness
$this->invoke($token); $this->invoke($token);
} }
// to lazy to check for global post and global pre // too lazy to check for global post and global pre
function testAttributeRemoved() { function testAttributeRemoved() {
$this->expectErrorCollection(E_ERROR, 'AttrValidator: Attribute removed'); $this->expectErrorCollection(E_ERROR, 'AttrValidator: Attribute removed');

@@ -6,28 +6,36 @@ require_once 'HTMLPurifier/ChildDef/Chameleon.php';
class HTMLPurifier_ChildDef_ChameleonTest extends HTMLPurifier_ChildDefHarness class HTMLPurifier_ChildDef_ChameleonTest extends HTMLPurifier_ChildDefHarness
{ {
function test() { var $isInline;
function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_ChildDef_Chameleon( $this->obj = new HTMLPurifier_ChildDef_Chameleon(
'b | i', // allowed only when in inline context 'b | i', // allowed only when in inline context
'b | i | div' // allowed only when in block context 'b | i | div' // allowed only when in block context
); );
$this->context->register('IsInline', $this->isInline);
}
function testInlineAlwaysAllowed() {
$this->isInline = true;
$this->assertResult( $this->assertResult(
'<b>Allowed.</b>', true, '<b>Allowed.</b>'
array(), array('IsInline' => true)
); );
}
function testBlockNotAllowedInInline() {
$this->isInline = true;
$this->assertResult( $this->assertResult(
'<div>Not allowed.</div>', '', '<div>Not allowed.</div>', ''
array(), array('IsInline' => true)
); );
}
function testBlockAllowedInNonInline() {
$this->isInline = false;
$this->assertResult( $this->assertResult(
'<div>Allowed.</div>', true, '<div>Allowed.</div>'
array(), array('IsInline' => false)
); );
} }
} }

@@ -6,13 +6,17 @@ require_once 'HTMLPurifier/ChildDef/Optional.php';
class HTMLPurifier_ChildDef_OptionalTest extends HTMLPurifier_ChildDefHarness class HTMLPurifier_ChildDef_OptionalTest extends HTMLPurifier_ChildDefHarness
{ {
function test() { function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_ChildDef_Optional('b | i'); $this->obj = new HTMLPurifier_ChildDef_Optional('b | i');
}
function testBasicUsage() {
$this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>'); $this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>');
}
function testRemoveForbiddenText() {
$this->assertResult('Not allowed text', ''); $this->assertResult('Not allowed text', '');
} }
} }

@@ -6,8 +6,7 @@ require_once 'HTMLPurifier/ChildDef/Required.php';
class HTMLPurifier_ChildDef_RequiredTest extends HTMLPurifier_ChildDefHarness class HTMLPurifier_ChildDef_RequiredTest extends HTMLPurifier_ChildDefHarness
{ {
function testParsing() { function testPrepareString() {
$def = new HTMLPurifier_ChildDef_Required('foobar | bang |gizmo'); $def = new HTMLPurifier_ChildDef_Required('foobar | bang |gizmo');
$this->assertIdentical($def->elements, $this->assertIdentical($def->elements,
array( array(
@@ -15,51 +14,61 @@ class HTMLPurifier_ChildDef_RequiredTest extends HTMLPurifier_ChildDefHarness
,'bang' => true ,'bang' => true
,'gizmo' => true ,'gizmo' => true
)); ));
}
function testPrepareArray() {
$def = new HTMLPurifier_ChildDef_Required(array('href', 'src')); $def = new HTMLPurifier_ChildDef_Required(array('href', 'src'));
$this->assertIdentical($def->elements, $this->assertIdentical($def->elements,
array( array(
'href' => true 'href' => true
,'src' => true ,'src' => true
)); ));
} }
function testPCDATAForbidden() { function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_ChildDef_Required('dt | dd'); $this->obj = new HTMLPurifier_ChildDef_Required('dt | dd');
}
function testEmptyInput() {
$this->assertResult('', false); $this->assertResult('', false);
}
function testRemoveIllegalTagsAndElements() {
$this->assertResult( $this->assertResult(
'<dt>Term</dt>Text in an illegal location'. '<dt>Term</dt>Text in an illegal location'.
'<dd>Definition</dd><b>Illegal tag</b>', '<dd>Definition</dd><b>Illegal tag</b>',
'<dt>Term</dt><dd>Definition</dd>'); '<dt>Term</dt><dd>Definition</dd>');
$this->assertResult('How do you do!', false); $this->assertResult('How do you do!', false);
}
function testIgnoreWhitespace() {
// whitespace shouldn't trigger it // whitespace shouldn't trigger it
$this->assertResult("\n<dd>Definition</dd> "); $this->assertResult("\n<dd>Definition</dd> ");
}
function testPreserveWhitespaceAfterRemoval() {
$this->assertResult( $this->assertResult(
'<dd>Definition</dd> <b></b> ', '<dd>Definition</dd> <b></b> ',
'<dd>Definition</dd> ' '<dd>Definition</dd> '
); );
}
function testDeleteNodeIfOnlyWhitespace() {
$this->assertResult("\t ", false); $this->assertResult("\t ", false);
} }
function testPCDATAAllowed() { function testPCDATAAllowed() {
$this->obj = new HTMLPurifier_ChildDef_Required('#PCDATA | b'); $this->obj = new HTMLPurifier_ChildDef_Required('#PCDATA | b');
$this->assertResult('Out <b>Bold text</b><img />', 'Out <b>Bold text</b>');
$this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>'); }
// with child escaping on function testPCDATAAllowedWithEscaping() {
$this->obj = new HTMLPurifier_ChildDef_Required('#PCDATA | b');
$this->config->set('Core', 'EscapeInvalidChildren', true);
$this->assertResult( $this->assertResult(
'<b>Bold text</b><img />', 'Out <b>Bold text</b><img />',
'<b>Bold text</b>&lt;img /&gt;', 'Out <b>Bold text</b>&lt;img /&gt;'
array(
'Core.EscapeInvalidChildren' => true
)
); );
} }

@@ -7,48 +7,77 @@ class HTMLPurifier_ChildDef_StrictBlockquoteTest
extends HTMLPurifier_ChildDefHarness extends HTMLPurifier_ChildDefHarness
{ {
function test() { function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_ChildDef_StrictBlockquote('div | p'); $this->obj = new HTMLPurifier_ChildDef_StrictBlockquote('div | p');
}
// assuming default wrap is p
function testEmptyInput() {
$this->assertResult(''); $this->assertResult('');
}
function testPreserveValidP() {
$this->assertResult('<p>Valid</p>'); $this->assertResult('<p>Valid</p>');
}
function testPreserveValidDiv() {
$this->assertResult('<div>Still valid</div>'); $this->assertResult('<div>Still valid</div>');
}
function testWrapTextWithP() {
$this->assertResult('Needs wrap', '<p>Needs wrap</p>'); $this->assertResult('Needs wrap', '<p>Needs wrap</p>');
}
function testNoWrapForWhitespaceOrValidElements() {
$this->assertResult('<p>Do not wrap</p> <p>Whitespace</p>'); $this->assertResult('<p>Do not wrap</p> <p>Whitespace</p>');
}
function testWrapTextNextToValidElements() {
$this->assertResult( $this->assertResult(
'Wrap'. '<p>Do not wrap</p>', 'Wrap'. '<p>Do not wrap</p>',
'<p>Wrap</p><p>Do not wrap</p>' '<p>Wrap</p><p>Do not wrap</p>'
); );
}
function testWrapInlineElements() {
$this->assertResult( $this->assertResult(
'<p>Do not</p>'.'<b>Wrap</b>', '<p>Do not</p>'.'<b>Wrap</b>',
'<p>Do not</p><p><b>Wrap</b></p>' '<p>Do not</p><p><b>Wrap</b></p>'
); );
}
function testWrapAndRemoveInvalidTags() {
$this->assertResult( $this->assertResult(
'<li>Not allowed</li>Paragraph.<p>Hmm.</p>', '<li>Not allowed</li>Paragraph.<p>Hmm.</p>',
'<p>Not allowedParagraph.</p><p>Hmm.</p>' '<p>Not allowedParagraph.</p><p>Hmm.</p>'
); );
}
function testWrapComplicatedSring() {
$this->assertResult( $this->assertResult(
$var = 'He said<br />perhaps<br />we should <b>nuke</b> them.', $var = 'He said<br />perhaps<br />we should <b>nuke</b> them.',
"<p>$var</p>" "<p>$var</p>"
); );
}
function testWrapAndRemoveInvalidTagsComplex() {
$this->assertResult( $this->assertResult(
'<foo>Bar</foo><bas /><b>People</b>Conniving.'. '<p>Fools!</p>', '<foo>Bar</foo><bas /><b>People</b>Conniving.'. '<p>Fools!</p>',
'<p>Bar'. '<b>People</b>Conniving.</p><p>Fools!</p>' '<p>Bar'. '<b>People</b>Conniving.</p><p>Fools!</p>'
); );
}
$this->assertResult('Needs wrap', '<div>Needs wrap</div>',
array('HTML.BlockWrapper' => 'div')); function testAlternateWrapper() {
$this->config->set('HTML', 'BlockWrapper', 'div');
$this->assertResult('Needs wrap', '<div>Needs wrap</div>');
} }
function testError() { function testError() {
$this->expectError('Cannot use non-block element as block wrapper');
$this->obj = new HTMLPurifier_ChildDef_StrictBlockquote('div | p'); $this->obj = new HTMLPurifier_ChildDef_StrictBlockquote('div | p');
$this->assertResult('Needs wrap', '<p>Needs wrap</p>', $this->config->set('HTML', 'BlockWrapper', 'dav');
array('HTML.BlockWrapper' => 'dav')); $this->assertResult('Needs wrap', '<p>Needs wrap</p>');
$this->swallowErrors();
} }
} }

@@ -3,46 +3,58 @@
require_once 'HTMLPurifier/ChildDefHarness.php'; require_once 'HTMLPurifier/ChildDefHarness.php';
require_once 'HTMLPurifier/ChildDef/Table.php'; require_once 'HTMLPurifier/ChildDef/Table.php';
// we're using empty tags to compact the tests: under real circumstances
// there would be contents in them
class HTMLPurifier_ChildDef_TableTest extends HTMLPurifier_ChildDefHarness class HTMLPurifier_ChildDef_TableTest extends HTMLPurifier_ChildDefHarness
{ {
function test() { function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_ChildDef_Table(); $this->obj = new HTMLPurifier_ChildDef_Table();
}
function testEmptyInput() {
$this->assertResult('', false); $this->assertResult('', false);
}
// we're using empty tags to compact the tests: under real circumstances
// there would be contents in them function testSingleRow() {
$this->assertResult('<tr />'); $this->assertResult('<tr />');
}
function testComplexContents() {
$this->assertResult('<caption /><col /><thead /><tfoot /><tbody>'. $this->assertResult('<caption /><col /><thead /><tfoot /><tbody>'.
'<tr><td>asdf</td></tr></tbody>'); '<tr><td>asdf</td></tr></tbody>');
$this->assertResult('<col /><col /><col /><tr />'); $this->assertResult('<col /><col /><col /><tr />');
}
// mixed up order
function testReorderContents() {
$this->assertResult( $this->assertResult(
'<col /><colgroup /><tbody /><tfoot /><thead /><tr>1</tr><caption /><tr />', '<col /><colgroup /><tbody /><tfoot /><thead /><tr>1</tr><caption /><tr />',
'<caption /><col /><colgroup /><thead /><tfoot /><tbody /><tr>1</tr><tr />'); '<caption /><col /><colgroup /><thead /><tfoot /><tbody /><tr>1</tr><tr />');
}
// duplicates of singles
// - first caption serves function testDuplicateProcessing() {
// - trailing tfoots/theads get turned into tbodys
$this->assertResult( $this->assertResult(
'<caption>1</caption><caption /><tbody /><tbody /><tfoot>1</tfoot><tfoot />', '<caption>1</caption><caption /><tbody /><tbody /><tfoot>1</tfoot><tfoot />',
'<caption>1</caption><tfoot>1</tfoot><tbody /><tbody /><tbody />' '<caption>1</caption><tfoot>1</tfoot><tbody /><tbody /><tbody />'
); );
}
// errant text dropped (until bubbling is implemented)
function testRemoveText() {
$this->assertResult('foo', false); $this->assertResult('foo', false);
}
// whitespace sticks to the previous element, last whitespace is
// stationary function testStickyWhitespaceOnTr() {
$this->assertResult("\n <tr />\n <tr />\n ", true, array('Output.Newline' => "\n")); $this->config->set('Output', 'Newline', "\n");
$this->assertResult("\n <tr />\n <tr />\n ");
}
function testStickyWhitespaceOnTSection() {
$this->config->set('Output', 'Newline', "\n");
$this->assertResult( $this->assertResult(
"\n\t<tbody />\n\t\t<tfoot />\n\t\t\t", "\n\t<tbody />\n\t\t<tfoot />\n\t\t\t",
"\n\t\t<tfoot />\n\t<tbody />\n\t\t\t", "\n\t\t<tfoot />\n\t<tbody />\n\t\t\t"
array('Output.Newline' => "\n")
); );
} }

@ -7,6 +7,7 @@ class HTMLPurifier_ChildDefHarness extends HTMLPurifier_ComplexHarness
{ {
function setUp() { function setUp() {
parent::setUp();
$this->obj = null; $this->obj = null;
$this->func = 'validateChildren'; $this->func = 'validateChildren';
$this->to_tokens = true; $this->to_tokens = true;

@ -67,41 +67,20 @@ class HTMLPurifier_ComplexHarness extends HTMLPurifier_Harness
* @param $context_array Context array in form of Key => Value or an actual * @param $context_array Context array in form of Key => Value or an actual
* context object. * context object.
*/ */
function assertResult($input, $expect = true, function assertResult($input, $expect = true) {
$config_array = array(), $context_array = array()
) {
// setup config
if ($this->config) {
$config = HTMLPurifier_Config::create($this->config);
$config->autoFinalize = false;
$config->loadArray($config_array);
} else {
$config = HTMLPurifier_Config::create($config_array);
}
// setup context object. Note that we are operating on a copy of it!
// When necessary, extend the test harness to allow post-tests
// on the context object
if (empty($this->context)) {
$context = new HTMLPurifier_Context();
$context->loadArray($context_array);
} else {
$context =& $this->context;
}
if ($this->to_tokens && is_string($input)) { if ($this->to_tokens && is_string($input)) {
// $func may cause $input to change, so "clone" another copy // $func may cause $input to change, so "clone" another copy
// to sacrifice // to sacrifice
$input = $this->lexer->tokenizeHTML($s = $input, $config, $context); $input = $this->tokenize($temp = $input);
$input_c = $this->lexer->tokenizeHTML($s, $config, $context); $input_c = $this->tokenize($temp);
} else { } else {
$input_c = $input; $input_c = $input;
} }
// call the function // call the function
$func = $this->func; $func = $this->func;
$result = $this->obj->$func($input_c, $config, $context); $result = $this->obj->$func($input_c, $this->config, $this->context);
// test a bool result // test a bool result
if (is_bool($result)) { if (is_bool($result)) {
@ -112,11 +91,9 @@ class HTMLPurifier_ComplexHarness extends HTMLPurifier_Harness
} }
if ($this->to_html) { if ($this->to_html) {
$result = $this->generator-> $result = $this->generate($result);
generateFromTokens($result, $config, $context);
if (is_array($expect)) { if (is_array($expect)) {
$expect = $this->generator-> $expect = $this->generate($expect);
generateFromTokens($expect, $config, $context);
} }
} }
@ -124,6 +101,20 @@ class HTMLPurifier_ComplexHarness extends HTMLPurifier_Harness
} }
/**
* Tokenize HTML into tokens, uses member variables for common variables
*/
function tokenize($html) {
return $this->lexer->tokenizeHTML($html, $this->config, $this->context);
}
/**
* Generate textual HTML from tokens
*/
function generate($tokens) {
return $this->generator->generateFromTokens($tokens, $this->config, $this->context);
}
} }

@ -17,7 +17,7 @@ class HTMLPurifier_EntityLookupTest extends HTMLPurifier_Harness
// special char // special char
$this->assertIdentical('"', $lookup->table['quot']); $this->assertIdentical('"', $lookup->table['quot']);
$this->assertIdentical('“', $lookup->table['ldquo']); $this->assertIdentical('“', $lookup->table['ldquo']);
$this->assertIdentical('<', $lookup->table['lt']); //expressed strangely $this->assertIdentical('<', $lookup->table['lt']); // expressed strangely in source file
// symbol char // symbol char
$this->assertIdentical('θ', $lookup->table['theta']); $this->assertIdentical('θ', $lookup->table['theta']);

@ -0,0 +1,39 @@
<?php
require_once 'HTMLPurifier/HTMLModuleHarness.php';
class HTMLPurifier_HTMLModule_ObjectTest extends HTMLPurifier_HTMLModuleHarness
{
function setUp() {
parent::setUp();
$this->config->set('HTML', 'Trusted', true);
}
function testDefaultRemoval() {
$this->config->set('HTML', 'Trusted', false);
$this->assertResult(
'<object></object>', ''
);
}
function testMinimal() {
$this->assertResult('<object></object>');
}
function testStandardUseCase() {
$this->assertResult(
'<object type="video/x-ms-wmv" data="http://domain.com/video.wmv" width="320" height="256">
<param name="src" value="http://domain.com/video.wmv" />
<param name="autostart" value="false" />
<param name="controller" value="true" />
<param name="pluginurl" value="http://www.microsoft.com/Windows/MediaPlayer/" />
<a href="http://www.microsoft.com/Windows/MediaPlayer/">Windows Media player required</a>
</object>'
);
}
// more test-cases?
}

@ -5,47 +5,51 @@ require_once 'HTMLPurifier/HTMLModuleHarness.php';
class HTMLPurifier_HTMLModule_ScriptingTest extends HTMLPurifier_HTMLModuleHarness class HTMLPurifier_HTMLModule_ScriptingTest extends HTMLPurifier_HTMLModuleHarness
{ {
function test() { function setUp() {
parent::setUp();
// default (remove everything) $this->config->set('HTML', 'Trusted', true);
$this->config->set('Core', 'CommentScriptContents', false);
}
function testDefaultRemoval() {
$this->config->set('HTML', 'Trusted', false);
$this->assertResult( $this->assertResult(
'<script type="text/javascript">foo();</script>', '' '<script type="text/javascript">foo();</script>', ''
); );
}
// enabled
function testPreserve() {
$this->assertResult( $this->assertResult(
'<script type="text/javascript">foo();</script>', true, '<script type="text/javascript">foo();</script>'
array('HTML.Trusted' => true)
); );
}
// CDATA
function testCDATAEnclosure() {
$this->assertResult( $this->assertResult(
'//<![CDATA[ '<script type="text/javascript">//<![CDATA[
alert("<This is compatible with XHTML>"); alert("<This is compatible with XHTML>");
//]]> ', true, //]]></script>'
array('HTML.Trusted' => true)
); );
}
// max
function testAllAttributes() {
$this->assertResult( $this->assertResult(
'<script '<script
defer="defer" defer="defer"
src="test.js" src="test.js"
type="text/javascript" type="text/javascript"
>PCDATA</script>', true, >PCDATA</script>'
array('HTML.Trusted' => true, 'Core.CommentScriptContents' => false)
); );
}
// unsupported
function testUnsupportedAttributes() {
$this->assertResult( $this->assertResult(
'<script '<script
type="text/javascript" type="text/javascript"
charset="utf-8" charset="utf-8"
>PCDATA</script>', >PCDATA</script>',
'<script type="text/javascript">PCDATA</script>', '<script type="text/javascript">PCDATA</script>'
array('HTML.Trusted' => true, 'Core.CommentScriptContents' => false)
); );
} }
} }

@ -8,29 +8,35 @@ class HTMLPurifier_Injector_AutoParagraphTest extends HTMLPurifier_InjectorHarne
function setup() { function setup() {
parent::setup(); parent::setup();
$this->config = array('AutoFormat.AutoParagraph' => true); $this->config->set('AutoFormat', 'AutoParagraph', true);
} }
function test() { function testSingleParagraph() {
$this->assertResult( $this->assertResult(
'Foobar', 'Foobar',
'<p>Foobar</p>' '<p>Foobar</p>'
); );
}
function testSingleMultiLineParagraph() {
$this->assertResult( $this->assertResult(
'Par 1 'Par 1
Par 1 still', Par 1 still',
'<p>Par 1 '<p>Par 1
Par 1 still</p>' Par 1 still</p>'
); );
}
function testTwoParagraphs() {
$this->assertResult( $this->assertResult(
'Par1 'Par1
Par2', Par2',
'<p>Par1</p><p>Par2</p>' '<p>Par1</p><p>Par2</p>'
); );
}
function testTwoParagraphsWithLotsOfSpace() {
$this->assertResult( $this->assertResult(
'Par1 'Par1
@ -39,15 +45,18 @@ Par2',
Par2', Par2',
'<p>Par1</p><p>Par2</p>' '<p>Par1</p><p>Par2</p>'
); );
}
function testTwoParagraphsWithInlineElements() {
$this->assertResult( $this->assertResult(
'<b>Par1</b> '<b>Par1</b>
<i>Par2</i>', <i>Par2</i>',
'<p><b>Par1</b></p><p><i>Par2</i></p>' '<p><b>Par1</b></p><p><i>Par2</i></p>'
); );
}
function testSingleParagraphThatLooksLikeTwo() {
$this->assertResult( $this->assertResult(
'<b>Par1 '<b>Par1
@ -56,29 +65,40 @@ Par2</b>',
Par2</b></p>' Par2</b></p>'
); );
}
function testAddParagraphAdjacentToParagraph() {
$this->assertResult( $this->assertResult(
'Par1<p>Par2</p>', 'Par1<p>Par2</p>',
'<p>Par1</p><p>Par2</p>' '<p>Par1</p><p>Par2</p>'
); );
}
function testParagraphUnclosedInlineElement() {
$this->assertResult( $this->assertResult(
'<b>Par1', '<b>Par1',
'<p><b>Par1</b></p>' '<p><b>Par1</b></p>'
); );
}
function testPreservePreTags() {
$this->assertResult( $this->assertResult(
'<pre>Par1 '<pre>Par1
Par1</pre>' Par1</pre>'
); );
}
function testIgnoreTrailingWhitespace() {
$this->assertResult( $this->assertResult(
'Par1 'Par1
', ',
'<p>Par1</p>' '<p>Par1</p>'
); );
}
function testDoNotParagraphBlockElements() {
$this->assertResult( $this->assertResult(
'Par1 'Par1
@ -87,19 +107,25 @@ Par1</pre>'
Par3', Par3',
'<p>Par1</p><div>Par2</div><p>Par3</p>' '<p>Par1</p><div>Par2</div><p>Par3</p>'
); );
}
function testParagraphTextAndInlineNodes() {
$this->assertResult( $this->assertResult(
'Par<b>1</b>', 'Par<b>1</b>',
'<p>Par<b>1</b></p>' '<p>Par<b>1</b></p>'
); );
}
function testIgnoreLeadingWhitespace() {
$this->assertResult( $this->assertResult(
' '
Par', Par',
'<p>Par</p>' '<p>Par</p>'
); );
}
function testIgnoreSurroundingWhitespace() {
$this->assertResult( $this->assertResult(
' '
@ -108,69 +134,90 @@ Par
', ',
'<p>Par</p>' '<p>Par</p>'
); );
}
function testParagraphInsideBlockNode() {
$this->assertResult( $this->assertResult(
'<div>Par1 '<div>Par1
Par2</div>', Par2</div>',
'<div><p>Par1</p><p>Par2</p></div>' '<div><p>Par1</p><p>Par2</p></div>'
); );
}
function testParagraphInlineNodeInsideBlockNode() {
$this->assertResult( $this->assertResult(
'<div><b>Par1</b> '<div><b>Par1</b>
Par2</div>', Par2</div>',
'<div><p><b>Par1</b></p><p>Par2</p></div>' '<div><p><b>Par1</b></p><p>Par2</p></div>'
); );
}
function testNoParagraphWhenOnlyOneInsideBlockNode() {
$this->assertResult('<div>Par1</div>'); $this->assertResult('<div>Par1</div>');
}
function testParagraphTwoInlineNodesInsideBlockNode() {
$this->assertResult( $this->assertResult(
'<div><b>Par1</b> '<div><b>Par1</b>
<i>Par2</i></div>', <i>Par2</i></div>',
'<div><p><b>Par1</b></p><p><i>Par2</i></p></div>' '<div><p><b>Par1</b></p><p><i>Par2</i></p></div>'
); );
}
function testPreserveInlineNodesInPreTag() {
$this->assertResult( $this->assertResult(
'<pre><b>Par1</b> '<pre><b>Par1</b>
<i>Par2</i></pre>', <i>Par2</i></pre>'
true
); );
}
function testSplitUpInternalsOfPTagInBlockNode() {
$this->assertResult( $this->assertResult(
'<div><p>Foo '<div><p>Foo
Bar</p></div>', Bar</p></div>',
'<div><p>Foo</p><p>Bar</p></div>' '<div><p>Foo</p><p>Bar</p></div>'
); );
}
function testSplitUpInlineNodesInPTagInBlockNode() {
$this->assertResult( $this->assertResult(
'<div><p><b>Foo</b> '<div><p><b>Foo</b>
<i>Bar</i></p></div>', <i>Bar</i></p></div>',
'<div><p><b>Foo</b></p><p><i>Bar</i></p></div>' '<div><p><b>Foo</b></p><p><i>Bar</i></p></div>'
); );
}
function testNoParagraphSingleInlineNodeInBlockNode() {
$this->assertResult( $this->assertResult(
'<div><b>Foo</b></div>', '<div><b>Foo</b></div>',
'<div><b>Foo</b></div>' '<div><b>Foo</b></div>'
); );
}
function testParagraphInBlockquote() {
$this->assertResult( $this->assertResult(
'<blockquote>Par1 '<blockquote>Par1
Par2</blockquote>', Par2</blockquote>',
'<blockquote><p>Par1</p><p>Par2</p></blockquote>' '<blockquote><p>Par1</p><p>Par2</p></blockquote>'
); );
}
function testNoParagraphBetweenListItem() {
$this->assertResult( $this->assertResult(
'<ul><li>Foo</li> '<ul><li>Foo</li>
<li>Bar</li></ul>', true <li>Bar</li></ul>'
); );
}
function testParagraphSingleElementWithSurroundingSpace() {
$this->assertResult( $this->assertResult(
'<div> '<div>
@ -179,7 +226,9 @@ Bar
</div>', </div>',
'<div><p>Bar</p></div>' '<div><p>Bar</p></div>'
); );
}
function testIgnoreExtraSpaceWithLeadingInlineNode() {
$this->assertResult( $this->assertResult(
'<b>Par1</b>a '<b>Par1</b>a
@ -188,99 +237,124 @@ Bar
Par2', Par2',
'<p><b>Par1</b>a</p><p>Par2</p>' '<p><b>Par1</b>a</p><p>Par2</p>'
); );
}
function testAbsorbExtraEndingPTag() {
$this->assertResult( $this->assertResult(
'Par1 'Par1
Par2</p>', Par2</p>',
'<p>Par1</p><p>Par2</p>' '<p>Par1</p><p>Par2</p>'
); );
}
function testAbsorbExtraEndingDivTag() {
$this->assertResult( $this->assertResult(
'Par1 'Par1
Par2</div>', Par2</div>',
'<p>Par1</p><p>Par2</p>' '<p>Par1</p><p>Par2</p>'
); );
}
function testDoNotParagraphSingleSurroundingSpaceInBlockNode() {
$this->assertResult( $this->assertResult(
'<div> '<div>
Par1 Par1
</div>', true </div>'
); );
}
function testBlockNodeTextDelimeterInBlockNode() {
$this->assertResult( $this->assertResult(
'<div>Par1 '<div>Par1
<div>Par2</div></div>', <div>Par2</div></div>',
'<div><p>Par1</p><div>Par2</div></div>' '<div><p>Par1</p><div>Par2</div></div>'
); );
}
function testBlockNodeTextDelimeterWithoutDoublespaceInBlockNode() {
$this->assertResult( $this->assertResult(
'<div>Par1 '<div>Par1
<div>Par2</div></div>', <div>Par2</div></div>',
'<div><p>Par1 '<div><p>Par1
</p><div>Par2</div></div>' </p><div>Par2</div></div>'
); );
}
function testBlockNodeTextDelimeterWithoutDoublespace() {
$this->assertResult( $this->assertResult(
'Par1 'Par1
<div>Par2</div>', <div>Par2</div>',
'<p>Par1 '<p>Par1
</p><div>Par2</div>' </p><div>Par2</div>'
); );
}
function testTwoParagraphsOfTextAndInlineNode() {
$this->assertResult( $this->assertResult(
'Par1 'Par1
<b>Par2</b>', <b>Par2</b>',
'<p>Par1</p><p><b>Par2</b></p>' '<p>Par1</p><p><b>Par2</b></p>'
); );
}
function testLeadingInlineNodeParagraph() {
$this->assertResult( $this->assertResult(
'<img /> Foo', '<img /> Foo',
'<p><img /> Foo</p>' '<p><img /> Foo</p>'
); );
}
function testTrailingInlineNodeParagraph() {
$this->assertResult( $this->assertResult(
'<li>Foo <a>bar</a></li>' '<li>Foo <a>bar</a></li>'
); );
}
function testTwoInlineNodeParagraph() {
$this->assertResult( $this->assertResult(
'<li><b>baz</b><a>bar</a></li>' '<li><b>baz</b><a>bar</a></li>'
); );
}
function testNoParagraphTrailingBlockNodeInBlockNode() {
$this->assertResult( $this->assertResult(
'<div><div>asdf</div><b>asdf</b></div>' '<div><div>asdf</div><b>asdf</b></div>'
); );
}
function testParagraphTrailingBlockNodeWithDoublespaceInBlockNode() {
$this->assertResult( $this->assertResult(
'<div><div>asdf</div> '<div><div>asdf</div>
<b>asdf</b></div>', <b>asdf</b></div>',
'<div><div>asdf</div><p><b>asdf</b></p></div>' '<div><div>asdf</div><p><b>asdf</b></p></div>'
); );
}
function testParagraphTwoInlineNodesAndWhitespaceNode() {
$this->assertResult( $this->assertResult(
'<b>One</b> <i>Two</i>', '<b>One</b> <i>Two</i>',
'<p><b>One</b> <i>Two</i></p>' '<p><b>One</b> <i>Two</i></p>'
); );
} }
function testInlineRootNode() { function testNoParagraphWithInlineRootNode() {
$this->config->set('HTML', 'Parent', 'span');
$this->assertResult( $this->assertResult(
'Par 'Par
Par2', Par2'
true,
array('AutoFormat.AutoParagraph' => true, 'HTML.Parent' => 'span')
); );
} }
function testNeeded() { function testErrorNeeded() {
$this->config->set('HTML', 'Allowed', 'b');
$this->expectError('Cannot enable AutoParagraph injector because p is not allowed'); $this->expectError('Cannot enable AutoParagraph injector because p is not allowed');
$this->assertResult('<b>foobar</b>', true, array('AutoFormat.AutoParagraph' => true, 'HTML.Allowed' => 'b')); $this->assertResult('<b>foobar</b>');
} }
} }

@ -8,35 +8,40 @@ class HTMLPurifier_Injector_LinkifyTest extends HTMLPurifier_InjectorHarness
function setup() { function setup() {
parent::setup(); parent::setup();
$this->config = array('AutoFormat.Linkify' => true); $this->config->set('AutoFormat', 'Linkify', true);
} }
function testLinkify() { function testLinkifyURLInRootNode() {
$this->assertResult( $this->assertResult(
'http://example.com', 'http://example.com',
'<a href="http://example.com">http://example.com</a>' '<a href="http://example.com">http://example.com</a>'
); );
}
function testLinkifyURLInInlineNode() {
$this->assertResult( $this->assertResult(
'<b>http://example.com</b>', '<b>http://example.com</b>',
'<b><a href="http://example.com">http://example.com</a></b>' '<b><a href="http://example.com">http://example.com</a></b>'
); );
}
function testBasicUsageCase() {
$this->assertResult( $this->assertResult(
'This URL http://example.com is what you need', 'This URL http://example.com is what you need',
'This URL <a href="http://example.com">http://example.com</a> is what you need' 'This URL <a href="http://example.com">http://example.com</a> is what you need'
); );
}
function testIgnoreURLInATag() {
$this->assertResult( $this->assertResult(
'<a>http://example.com/</a>' '<a>http://example.com/</a>'
); );
} }
function testNeeded() { function testNeeded() {
$this->config->set('HTML', 'Allowed', 'b');
$this->expectError('Cannot enable Linkify injector because a is not allowed'); $this->expectError('Cannot enable Linkify injector because a is not allowed');
$this->assertResult('http://example.com/', true, array('AutoFormat.Linkify' => true, 'HTML.Allowed' => 'b')); $this->assertResult('http://example.com/');
} }
} }

@ -8,39 +8,53 @@ class HTMLPurifier_Injector_PurifierLinkifyTest extends HTMLPurifier_InjectorHar
function setup() { function setup() {
parent::setup(); parent::setup();
$this->config = array( $this->config->set('AutoFormat', 'PurifierLinkify', true);
'AutoFormat.PurifierLinkify' => true, $this->config->set('AutoFormatParam', 'PurifierLinkifyDocURL', '#%s');
'AutoFormatParam.PurifierLinkifyDocURL' => '#%s'
);
} }
function testLinkify() { function testNoTriggerCharacer() {
$this->assertResult('Foobar'); $this->assertResult('Foobar');
}
function testTriggerCharacterInIrrelevantContext() {
$this->assertResult('20% off!'); $this->assertResult('20% off!');
}
function testPreserveNamespace() {
$this->assertResult('%Core namespace (not recognized)'); $this->assertResult('%Core namespace (not recognized)');
}
function testLinkifyBasic() {
$this->assertResult( $this->assertResult(
'%Namespace.Directive', '%Namespace.Directive',
'<a href="#Namespace.Directive">%Namespace.Directive</a>' '<a href="#Namespace.Directive">%Namespace.Directive</a>'
); );
}
function testLinkifyWithAdjacentTextNodes() {
$this->assertResult( $this->assertResult(
'This %Namespace.Directive thing', 'This %Namespace.Directive thing',
'This <a href="#Namespace.Directive">%Namespace.Directive</a> thing' 'This <a href="#Namespace.Directive">%Namespace.Directive</a> thing'
); );
}
function testLinkifyInBlock() {
$this->assertResult( $this->assertResult(
'<div>This %Namespace.Directive thing</div>', '<div>This %Namespace.Directive thing</div>',
'<div>This <a href="#Namespace.Directive">%Namespace.Directive</a> thing</div>' '<div>This <a href="#Namespace.Directive">%Namespace.Directive</a> thing</div>'
); );
}
function testPreserveInATag() {
$this->assertResult( $this->assertResult(
'<a>%Namespace.Directive</a>' '<a>%Namespace.Directive</a>'
); );
} }
function testNeeded() { function testNeeded() {
$this->config->set('HTML', 'Allowed', 'b');
$this->expectError('Cannot enable PurifierLinkify injector because a is not allowed'); $this->expectError('Cannot enable PurifierLinkify injector because a is not allowed');
$this->assertResult('%Namespace.Directive', true, array('AutoFormat.PurifierLinkify' => true, 'HTML.Allowed' => 'b')); $this->assertResult('%Namespace.Directive');
} }
} }

@ -5,70 +5,98 @@ require_once 'HTMLPurifier/Lexer/DirectLex.php';
class HTMLPurifier_LexerTest extends HTMLPurifier_Harness class HTMLPurifier_LexerTest extends HTMLPurifier_Harness
{ {
var $Lexer;
var $DirectLex, $PEARSax3, $DOMLex;
var $_entity_lookup;
var $_has_pear = false; var $_has_pear = false;
var $_has_dom = false;
function setUp() { function HTMLPurifier_LexerTest() {
$this->Lexer = new HTMLPurifier_Lexer(); parent::HTMLPurifier_Harness();
// E_STRICT = 2048, int used for PHP4 compat: this check disables
$this->DirectLex = new HTMLPurifier_Lexer_DirectLex(); // PEAR if PHP 5 strict mode is on, since the class is not strict safe
if (
if ( $GLOBALS['HTMLPurifierTest']['PEAR'] && $GLOBALS['HTMLPurifierTest']['PEAR'] &&
((error_reporting() & E_STRICT) != E_STRICT) ((error_reporting() & 2048) != 2048) // ought to be a better way
) { ) {
$this->_has_pear = true;
require_once 'HTMLPurifier/Lexer/PEARSax3.php'; require_once 'HTMLPurifier/Lexer/PEARSax3.php';
$this->PEARSax3 = new HTMLPurifier_Lexer_PEARSax3(); $this->_has_pear = true;
} }
if ($GLOBALS['HTMLPurifierTest']['PH5P']) {
$this->_has_dom = version_compare(PHP_VERSION, '5', '>='); require_once 'HTMLPurifier/Lexer/PH5P.php';
if ($this->_has_dom) {
require_once 'HTMLPurifier/Lexer/DOMLex.php';
$this->DOMLex = new HTMLPurifier_Lexer_DOMLex();
} }
$this->_entity_lookup = HTMLPurifier_EntityLookup::instance();
} }
// HTMLPurifier_Lexer::create() --------------------------------------------
function test_create() { function test_create() {
$config = HTMLPurifier_Config::create(array('Core.MaintainLineNumbers' => true)); $this->config->set('Core', 'MaintainLineNumbers', true);
$lexer = HTMLPurifier_Lexer::create($config); $lexer = HTMLPurifier_Lexer::create($this->config);
$this->assertIsA($lexer, 'HTMLPurifier_Lexer_DirectLex'); $this->assertIsA($lexer, 'HTMLPurifier_Lexer_DirectLex');
} }
// HTMLPurifier_Lexer->parseData() -----------------------------------------
function assertParseData($input, $expect = true) {
if ($expect === true) $expect = $input;
$lexer = new HTMLPurifier_Lexer();
$this->assertIdentical($expect, $lexer->parseData($input));
}
function test_parseData_plainText() {
$this->assertParseData('asdf');
}
function test_parseData_ampersandEntity() {
$this->assertParseData('&amp;', '&');
}
function test_parseData_quotEntity() {
$this->assertParseData('&quot;', '"');
}
function test_parseData_aposNumericEntity() {
$this->assertParseData('&#039;', "'");
}
function test_parseData_aposCompactNumericEntity() {
$this->assertParseData('&#39;', "'");
}
function test_parseData_adjacentAmpersandEntities() {
$this->assertParseData('&amp;&amp;&amp;', '&&&');
}
function test_parseData_trailingUnescapedAmpersand() {
$this->assertParseData('&amp;&', '&&');
}
function test_parseData_internalUnescapedAmpersand() {
$this->assertParseData('Procter & Gamble');
}
function test_parseData_improperEntityFaultToleranceTest() {
$this->assertParseData('&#x2D;');
}
// HTMLPurifier_Lexer->extractBody() ---------------------------------------
function assertExtractBody($text, $extract = true) { function assertExtractBody($text, $extract = true) {
$result = $this->Lexer->extractBody($text); $lexer = new HTMLPurifier_Lexer();
$result = $lexer->extractBody($text);
if ($extract === true) $extract = $text; if ($extract === true) $extract = $text;
$this->assertIdentical($extract, $result); $this->assertIdentical($extract, $result);
} }
function test_parseData() { function test_extractBody_noBodyTags() {
$HP =& $this->Lexer; $this->assertExtractBody('<b>Bold</b>');
$this->assertIdentical('asdf', $HP->parseData('asdf'));
$this->assertIdentical('&', $HP->parseData('&amp;'));
$this->assertIdentical('"', $HP->parseData('&quot;'));
$this->assertIdentical("'", $HP->parseData('&#039;'));
$this->assertIdentical("'", $HP->parseData('&#39;'));
$this->assertIdentical('&&&', $HP->parseData('&amp;&amp;&amp;'));
$this->assertIdentical('&&', $HP->parseData('&amp;&')); // [INVALID]
$this->assertIdentical('Procter & Gamble',
$HP->parseData('Procter & Gamble')); // [INVALID]
// This is not special, thus not converted. Test of fault tolerance,
// realistically speaking, this should never happen
$this->assertIdentical('&#x2D;', $HP->parseData('&#x2D;'));
} }
function test_extractBody_lowercaseBodyTags() {
function test_extractBody() {
$this->assertExtractBody('<b>Bold</b>');
$this->assertExtractBody('<html><body><b>Bold</b></body></html>', '<b>Bold</b>'); $this->assertExtractBody('<html><body><b>Bold</b></body></html>', '<b>Bold</b>');
}
function test_extractBody_uppercaseBodyTags() {
$this->assertExtractBody('<HTML><BODY><B>Bold</B></BODY></HTML>', '<B>Bold</B>'); $this->assertExtractBody('<HTML><BODY><B>Bold</B></BODY></HTML>', '<B>Bold</B>');
}
function test_extractBody_realisticUseCase() {
$this->assertExtractBody( $this->assertExtractBody(
'<?xml version="1.0" '<?xml version="1.0"
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
@ -96,303 +124,404 @@ class HTMLPurifier_LexerTest extends HTMLPurifier_Harness
</div> </div>
</form> </form>
'); ');
$this->assertExtractBody('<html><body bgcolor="#F00"><b>Bold</b></body></html>', '<b>Bold</b>');
$this->assertExtractBody('<body>asdf'); // not closed, don't accept
} }
function test_tokenizeHTML() { function test_extractBody_bodyWithAttributes() {
$this->assertExtractBody('<html><body bgcolor="#F00"><b>Bold</b></body></html>', '<b>Bold</b>');
$input = array();
$expect = array();
$sax_expect = array();
$config = array();
$input[0] = '';
$expect[0] = array();
$input[1] = 'This is regular text.';
$expect[1] = array(
new HTMLPurifier_Token_Text('This is regular text.')
);
$input[2] = 'This is <b>bold</b> text';
$expect[2] = array(
new HTMLPurifier_Token_Text('This is ')
,new HTMLPurifier_Token_Start('b', array())
,new HTMLPurifier_Token_Text('bold')
,new HTMLPurifier_Token_End('b')
,new HTMLPurifier_Token_Text(' text')
);
$input[3] = '<DIV>Totally rad dude. <b>asdf</b></div>';
$expect[3] = array(
new HTMLPurifier_Token_Start('DIV', array())
,new HTMLPurifier_Token_Text('Totally rad dude. ')
,new HTMLPurifier_Token_Start('b', array())
,new HTMLPurifier_Token_Text('asdf')
,new HTMLPurifier_Token_End('b')
,new HTMLPurifier_Token_End('div')
);
// [XML-INVALID]
$input[4] = '<asdf></asdf><d></d><poOloka><poolasdf><ds></asdf></ASDF>';
$expect[4] = array(
new HTMLPurifier_Token_Start('asdf')
,new HTMLPurifier_Token_End('asdf')
,new HTMLPurifier_Token_Start('d')
,new HTMLPurifier_Token_End('d')
,new HTMLPurifier_Token_Start('poOloka')
,new HTMLPurifier_Token_Start('poolasdf')
,new HTMLPurifier_Token_Start('ds')
,new HTMLPurifier_Token_End('asdf')
,new HTMLPurifier_Token_End('ASDF')
);
// DOM is different because it condenses empty tags into REAL empty ones
// as well as makes it well-formed
$dom_expect[4] = array(
new HTMLPurifier_Token_Empty('asdf')
,new HTMLPurifier_Token_Empty('d')
,new HTMLPurifier_Token_Start('pooloka')
,new HTMLPurifier_Token_Start('poolasdf')
,new HTMLPurifier_Token_Empty('ds')
,new HTMLPurifier_Token_End('poolasdf')
,new HTMLPurifier_Token_End('pooloka')
);
$input[5] = '<a'."\t".'href="foobar.php"'."\n".'title="foo!">Link to <b id="asdf">foobar</b></a>';
$expect[5] = array(
new HTMLPurifier_Token_Start('a',array('href'=>'foobar.php','title'=>'foo!'))
,new HTMLPurifier_Token_Text('Link to ')
,new HTMLPurifier_Token_Start('b',array('id'=>'asdf'))
,new HTMLPurifier_Token_Text('foobar')
,new HTMLPurifier_Token_End('b')
,new HTMLPurifier_Token_End('a')
);
$input[6] = '<br />';
$expect[6] = array(
new HTMLPurifier_Token_Empty('br')
);
// [SGML-INVALID] [RECOVERABLE]
$input[7] = '<!-- Comment --> <!-- not so well formed --->';
$expect[7] = array(
new HTMLPurifier_Token_Comment(' Comment ')
,new HTMLPurifier_Token_Text(' ')
,new HTMLPurifier_Token_Comment(' not so well formed -')
);
$sax_expect[7] = false; // we need to figure out proper comment output
// [SGML-INVALID]
$input[8] = '<a href=""';
$expect[8] = array(
new HTMLPurifier_Token_Text('<a href=""')
);
// SAX parses it into a tag
$sax_expect[8] = array(
new HTMLPurifier_Token_Start('a', array('href'=>''))
);
// DOM parses it into an empty tag
$dom_expect[8] = array(
new HTMLPurifier_Token_Empty('a', array('href'=>''))
);
$input[9] = '&lt;b&gt;';
$expect[9] = array(
new HTMLPurifier_Token_Text('<b>')
);
$sax_expect[9] = array(
new HTMLPurifier_Token_Text('<')
,new HTMLPurifier_Token_Text('b')
,new HTMLPurifier_Token_Text('>')
);
// note that SAX can clump text nodes together. We won't be
// too picky though
// [SGML-INVALID]
$input[10] = '<a "=>';
// We barf on this, aim for no attributes
$expect[10] = array(
new HTMLPurifier_Token_Start('a', array('"' => ''))
);
// DOM correctly has no attributes, but also closes the tag
$dom_expect[10] = array(
new HTMLPurifier_Token_Empty('a')
);
// SAX barfs on this
$sax_expect[10] = array(
new HTMLPurifier_Token_Start('a', array('"' => ''))
);
// [INVALID] [RECOVERABLE]
$input[11] = '"';
$expect[11] = array( new HTMLPurifier_Token_Text('"') );
// compare with this valid one:
$input[12] = '&quot;';
$expect[12] = array( new HTMLPurifier_Token_Text('"') );
$sax_expect[12] = false; // choked!
// CDATA sections!
$input[13] = '<![CDATA[You <b>can&#39;t</b> get me!]]>';
$expect[13] = array( new HTMLPurifier_Token_Text(
'You <b>can&#39;t</b> get me!' // raw
) );
$sax_expect[13] = array( // SAX has a separate call for each entity
new HTMLPurifier_Token_Text('You '),
new HTMLPurifier_Token_Text('<'),
new HTMLPurifier_Token_Text('b'),
new HTMLPurifier_Token_Text('>'),
new HTMLPurifier_Token_Text('can'),
new HTMLPurifier_Token_Text('&'),
new HTMLPurifier_Token_Text('#39;t'),
new HTMLPurifier_Token_Text('<'),
new HTMLPurifier_Token_Text('/b'),
new HTMLPurifier_Token_Text('>'),
new HTMLPurifier_Token_Text(' get me!')
);
$char_theta = $this->_entity_lookup->table['theta'];
$char_rarr = $this->_entity_lookup->table['rarr'];
// test entity replacement
$input[14] = '&theta;';
$expect[14] = array( new HTMLPurifier_Token_Text($char_theta) );
// test that entities aren't replaced in CDATA sections
$input[15] = '&theta; <![CDATA[&rarr;]]>';
$expect[15] = array( new HTMLPurifier_Token_Text($char_theta . ' &rarr;') );
$sax_expect[15] = array(
new HTMLPurifier_Token_Text($char_theta . ' '),
new HTMLPurifier_Token_Text('&'),
new HTMLPurifier_Token_Text('rarr;')
);
// test entity resolution in attributes
$input[16] = '<a href="index.php?title=foo&amp;id=bar">Link</a>';
$expect[16] = array(
new HTMLPurifier_Token_Start('a',array('href' => 'index.php?title=foo&id=bar'))
,new HTMLPurifier_Token_Text('Link')
,new HTMLPurifier_Token_End('a')
);
// test that UTF-8 is preserved
$char_hearts = $this->_entity_lookup->table['hearts'];
$input[17] = $char_hearts;
$expect[17] = array( new HTMLPurifier_Token_Text($char_hearts) );
// test weird characters in attributes
$input[18] = '<br test="x &lt; 6" />';
$expect[18] = array( new HTMLPurifier_Token_Empty('br', array('test' => 'x < 6')) );
// test emoticon protection
$input[19] = '<b>Whoa! <3 That\'s not good >.></b>';
$expect[19] = array(
new HTMLPurifier_Token_Start('b'),
new HTMLPurifier_Token_Text('Whoa! '),
new HTMLPurifier_Token_Text('<3 That\'s not good >'),
new HTMLPurifier_Token_Text('.>'),
new HTMLPurifier_Token_End('b'),
);
$dom_expect[19] = array(
new HTMLPurifier_Token_Start('b'),
new HTMLPurifier_Token_Text('Whoa! <3 That\'s not good >.>'),
new HTMLPurifier_Token_End('b'),
);
$sax_expect[19] = false; // SAX drops the < character
$config[19] = HTMLPurifier_Config::create(array('Core.AggressivelyFixLt' => true));
// test comment parsing with funky characters inside
$input[20] = '<!-- This >< comment --><br />';
$expect[20] = array(
new HTMLPurifier_Token_Comment(' This >< comment '),
new HTMLPurifier_Token_Empty('br')
);
$sax_expect[20] = false;
$config[20] = HTMLPurifier_Config::create(array('Core.AggressivelyFixLt' => true));
// test comment parsing of missing end
$input[21] = '<!-- This >< comment';
$expect[21] = array(
new HTMLPurifier_Token_Comment(' This >< comment')
);
$sax_expect[21] = false;
$dom_expect[21] = false;
$config[21] = HTMLPurifier_Config::create(array('Core.AggressivelyFixLt' => true));
// test CDATA tags
$input[22] = '<script>alert("<foo>");</script>';
$expect[22] = array(
new HTMLPurifier_Token_Start('script')
,new HTMLPurifier_Token_Text('alert("<foo>");')
,new HTMLPurifier_Token_End('script')
);
$config[22] = HTMLPurifier_Config::create(array('HTML.Trusted' => true));
$sax_expect[22] = false;
// test escaping
$input[23] = '<!-- This comment < &lt; & -->';
$expect[23] = array(
new HTMLPurifier_Token_Comment(' This comment < &lt; & ') );
$sax_expect[23] = false;
$config[23] = HTMLPurifier_Config::create(array('Core.AggressivelyFixLt' => true));
// more DirectLex edge-cases
$input[24] = '<a href="><>">';
$expect[24] = array(
new HTMLPurifier_Token_Start('a', array('href' => '')),
new HTMLPurifier_Token_Text('<">')
);
$sax_expect[24] = false;
$dom_expect[24] = array(
new HTMLPurifier_Token_Empty('a', array('href' => '><>'))
);
$default_config = HTMLPurifier_Config::createDefault();
$default_context = new HTMLPurifier_Context();
foreach($input as $i => $discard) {
if (!isset($config[$i])) $config[$i] = $default_config;
$result = $this->DirectLex->tokenizeHTML($input[$i], $config[$i], $default_context);
$this->assertIdentical($expect[$i], $result, 'DirectLexTest '.$i.': %s');
paintIf($result, $expect[$i] != $result);
if ($this->_has_pear) {
// assert unless I say otherwise
$sax_result = $this->PEARSax3->tokenizeHTML($input[$i], $config[$i], $default_context);
if (!isset($sax_expect[$i])) {
// by default, assert with normal result
$this->assertIdentical($expect[$i], $sax_result, 'PEARSax3Test '.$i.': %s');
paintIf($sax_result, $expect[$i] != $sax_result);
} elseif ($sax_expect[$i] === false) {
// assertions were turned off, optionally dump
// paintIf($sax_expect, $i == NUMBER);
} else {
// match with a custom SAX result array
$this->assertIdentical($sax_expect[$i], $sax_result, 'PEARSax3Test (custom) '.$i.': %s');
paintIf($sax_result, $sax_expect[$i] != $sax_result);
}
}
if ($this->_has_dom) {
$dom_result = $this->DOMLex->tokenizeHTML($input[$i], $config[$i], $default_context);
// same structure as SAX
if (!isset($dom_expect[$i])) {
$this->assertIdentical($expect[$i], $dom_result, 'DOMLexTest '.$i.': %s');
paintIf($dom_result, $expect[$i] != $dom_result);
} elseif ($dom_expect[$i] === false) {
// paintIf($dom_result, $i == NUMBER);
} else {
$this->assertIdentical($dom_expect[$i], $dom_result, 'DOMLexTest (custom) '.$i.': %s');
paintIf($dom_result, $dom_expect[$i] != $dom_result);
}
}
}
} }
function test_extractBody_preserveUnclosedBody() {
$this->assertExtractBody('<body>asdf'); // not closed, don't accept
}
// HTMLPurifier_Lexer->tokenizeHTML() --------------------------------------
function assertTokenization($input, $expect, $alt_expect = array()) {
$lexers = array();
$lexers['DirectLex'] = new HTMLPurifier_Lexer_DirectLex();
if ($this->_has_pear) $lexers['PEARSax3'] = new HTMLPurifier_Lexer_PEARSax3();
if (version_compare(PHP_VERSION, "5", ">=") && class_exists('DOMDocument')) {
$lexers['DOMLex'] = new HTMLPurifier_Lexer_DOMLex();
$lexers['PH5P'] = new HTMLPurifier_Lexer_PH5P();
}
foreach ($lexers as $name => $lexer) {
$result = $lexer->tokenizeHTML($input, $this->config, $this->context);
if (isset($alt_expect[$name])) {
if ($alt_expect[$name] === false) continue;
$t_expect = $alt_expect[$name];
$this->assertIdentical($result, $alt_expect[$name], "$name: %s");
} else {
$t_expect = $expect;
$this->assertIdentical($result, $expect, "$name: %s");
}
if ($t_expect != $result) {
printTokens($result);
//var_dump($result);
}
}
}
function test_tokenizeHTML_emptyInput() {
$this->assertTokenization('', array());
}
function test_tokenizeHTML_plainText() {
$this->assertTokenization(
'This is regular text.',
array(
new HTMLPurifier_Token_Text('This is regular text.')
)
);
}
function test_tokenizeHTML_textAndTags() {
$this->assertTokenization(
'This is <b>bold</b> text',
array(
new HTMLPurifier_Token_Text('This is '),
new HTMLPurifier_Token_Start('b', array()),
new HTMLPurifier_Token_Text('bold'),
new HTMLPurifier_Token_End('b'),
new HTMLPurifier_Token_Text(' text'),
)
);
}
function test_tokenizeHTML_normalizeCase() {
$this->assertTokenization(
'<DIV>Totally rad dude. <b>asdf</b></div>',
array(
new HTMLPurifier_Token_Start('DIV', array()),
new HTMLPurifier_Token_Text('Totally rad dude. '),
new HTMLPurifier_Token_Start('b', array()),
new HTMLPurifier_Token_Text('asdf'),
new HTMLPurifier_Token_End('b'),
new HTMLPurifier_Token_End('div'),
)
);
}
function test_tokenizeHTML_notWellFormed() {
$this->assertTokenization(
'<asdf></asdf><d></d><poOloka><poolasdf><ds></asdf></ASDF>',
array(
new HTMLPurifier_Token_Start('asdf'),
new HTMLPurifier_Token_End('asdf'),
new HTMLPurifier_Token_Start('d'),
new HTMLPurifier_Token_End('d'),
new HTMLPurifier_Token_Start('poOloka'),
new HTMLPurifier_Token_Start('poolasdf'),
new HTMLPurifier_Token_Start('ds'),
new HTMLPurifier_Token_End('asdf'),
new HTMLPurifier_Token_End('ASDF'),
),
array(
'DOMLex' => $alt = array(
new HTMLPurifier_Token_Empty('asdf'),
new HTMLPurifier_Token_Empty('d'),
new HTMLPurifier_Token_Start('pooloka'),
new HTMLPurifier_Token_Start('poolasdf'),
new HTMLPurifier_Token_Empty('ds'),
new HTMLPurifier_Token_End('poolasdf'),
new HTMLPurifier_Token_End('pooloka'),
),
'PH5P' => $alt,
)
);
}
function test_tokenizeHTML_whitespaceInTag() {
$this->assertTokenization(
'<a'."\t".'href="foobar.php"'."\n".'title="foo!">Link to <b id="asdf">foobar</b></a>',
array(
new HTMLPurifier_Token_Start('a',array('href'=>'foobar.php','title'=>'foo!')),
new HTMLPurifier_Token_Text('Link to '),
new HTMLPurifier_Token_Start('b',array('id'=>'asdf')),
new HTMLPurifier_Token_Text('foobar'),
new HTMLPurifier_Token_End('b'),
new HTMLPurifier_Token_End('a'),
)
);
}
function test_tokenizeHTML_emptyTag() {
$this->assertTokenization(
'<br />',
array( new HTMLPurifier_Token_Empty('br') )
);
}
function test_tokenizeHTML_comment() {
$this->assertTokenization(
'<!-- Comment -->',
array( new HTMLPurifier_Token_Comment(' Comment ') ),
array(
'PEARSax3' => array( new HTMLPurifier_Token_Comment('-- Comment --') ),
)
);
}
function test_tokenizeHTML_malformedComment() {
$this->assertTokenization(
'<!-- not so well formed --->',
array( new HTMLPurifier_Token_Comment(' not so well formed -') ),
array(
'PEARSax3' => array( new HTMLPurifier_Token_Comment('-- not so well formed ---') ),
)
);
}
function test_tokenizeHTML_unterminatedTag() {
$this->assertTokenization(
'<a href=""',
array( new HTMLPurifier_Token_Text('<a href=""') ),
array(
// I like our behavior better, but it's non-standard
'DOMLex' => array( new HTMLPurifier_Token_Empty('a', array('href'=>'')) ),
'PEARSax3' => array( new HTMLPurifier_Token_Start('a', array('href'=>'')) ),
'PH5P' => false, // total barfing, grabs scaffolding too
)
);
}
function test_tokenizeHTML_specialEntities() {
$this->assertTokenization(
'&lt;b&gt;',
array(
new HTMLPurifier_Token_Text('<b>')
),
array(
// some parsers will separate entities out
'PEARSax3' => $split = array(
new HTMLPurifier_Token_Text('<'),
new HTMLPurifier_Token_Text('b'),
new HTMLPurifier_Token_Text('>'),
),
'PH5P' => $split,
)
);
}
function test_tokenizeHTML_earlyQuote() {
$this->assertTokenization(
'<a "=>',
array( new HTMLPurifier_Token_Empty('a') ),
array(
// we barf on this input
'DirectLex' => $tokens = array(
new HTMLPurifier_Token_Start('a', array('"' => ''))
),
'PEARSax3' => $tokens,
'PH5P' => array(
new HTMLPurifier_Token_Empty('a', array('"' => ''))
),
)
);
}
function test_tokenizeHTML_unescapedQuote() {
$this->assertTokenization(
'"',
array( new HTMLPurifier_Token_Text('"') )
);
}
function test_tokenizeHTML_escapedQuote() {
$this->assertTokenization(
'&quot;',
array( new HTMLPurifier_Token_Text('"') ),
array(
'PEARSax3' => false, // PEAR barfs on this
)
);
}
function test_tokenizeHTML_cdata() {
$this->assertTokenization(
'<![CDATA[You <b>can&#39;t</b> get me!]]>',
array( new HTMLPurifier_Token_Text('You <b>can&#39;t</b> get me!') ),
array(
// PEAR splits up all of the CDATA
'PEARSax3' => $split = array(
new HTMLPurifier_Token_Text('You '),
new HTMLPurifier_Token_Text('<'),
new HTMLPurifier_Token_Text('b'),
new HTMLPurifier_Token_Text('>'),
new HTMLPurifier_Token_Text('can'),
new HTMLPurifier_Token_Text('&'),
new HTMLPurifier_Token_Text('#39;t'),
new HTMLPurifier_Token_Text('<'),
new HTMLPurifier_Token_Text('/b'),
new HTMLPurifier_Token_Text('>'),
new HTMLPurifier_Token_Text(' get me!'),
),
'PH5P' => $split,
)
);
}
function test_tokenizeHTML_characterEntity() {
$this->assertTokenization(
'&theta;',
array( new HTMLPurifier_Token_Text("\xCE\xB8") )
);
}
function test_tokenizeHTML_characterEntityInCDATA() {
$this->assertTokenization(
'<![CDATA[&rarr;]]>',
array( new HTMLPurifier_Token_Text("&rarr;") ),
array(
'PEARSax3' => $split = array(
new HTMLPurifier_Token_Text('&'),
new HTMLPurifier_Token_Text('rarr;'),
),
'PH5P' => $split,
)
);
}
function test_tokenizeHTML_entityInAttribute() {
$this->assertTokenization(
'<a href="index.php?title=foo&amp;id=bar">Link</a>',
array(
new HTMLPurifier_Token_Start('a',array('href' => 'index.php?title=foo&id=bar')),
new HTMLPurifier_Token_Text('Link'),
new HTMLPurifier_Token_End('a'),
)
);
}
function test_tokenizeHTML_preserveUTF8() {
$this->assertTokenization(
"\xCE\xB8",
array( new HTMLPurifier_Token_Text("\xCE\xB8") )
);
}
function test_tokenizeHTML_specialEntityInAttribute() {
$this->assertTokenization(
'<br test="x &lt; 6" />',
array( new HTMLPurifier_Token_Empty('br', array('test' => 'x < 6')) )
);
}
function test_tokenizeHTML_emoticonProtection() {
$this->config->set('Core', 'AggressivelyFixLt', true);
$this->assertTokenization(
'<b>Whoa! <3 That\'s not good >.></b>',
array(
new HTMLPurifier_Token_Start('b'),
new HTMLPurifier_Token_Text('Whoa! '),
new HTMLPurifier_Token_Text('<3 That\'s not good >'),
new HTMLPurifier_Token_Text('.>'),
new HTMLPurifier_Token_End('b')
),
array(
// text is absorbed together
'DOMLex' => array(
new HTMLPurifier_Token_Start('b'),
new HTMLPurifier_Token_Text('Whoa! <3 That\'s not good >.>'),
new HTMLPurifier_Token_End('b'),
),
'PEARSax3' => false, // totally mangled
'PH5P' => array( // interesting grouping
new HTMLPurifier_Token_Start('b'),
new HTMLPurifier_Token_Text('Whoa! '),
new HTMLPurifier_Token_Text('<'),
new HTMLPurifier_Token_Text('3 That\'s not good >.>'),
new HTMLPurifier_Token_End('b'),
),
)
);
}
function test_tokenizeHTML_commentWithFunkyChars() {
$this->assertTokenization(
'<!-- This >< comment --><br />',
array(
new HTMLPurifier_Token_Comment(' This >< comment '),
new HTMLPurifier_Token_Empty('br'),
),
array(
'PEARSax3' => false,
)
);
}
function test_tokenizeHTML_unterminatedComment() {
$this->assertTokenization(
'<!-- This >< comment',
array( new HTMLPurifier_Token_Comment(' This >< comment') ),
array(
'DOMLex' => false,
'PEARSax3' => false,
'PH5P' => false,
)
);
}
function test_tokenizeHTML_scriptCDATAContents() {
$this->config->set('HTML', 'Trusted', true);
$this->assertTokenization(
'Foo: <script>alert("<foo>");</script>',
array(
new HTMLPurifier_Token_Text('Foo: '),
new HTMLPurifier_Token_Start('script'),
new HTMLPurifier_Token_Text('alert("<foo>");'),
new HTMLPurifier_Token_End('script'),
),
array(
'PEARSax3' => false,
// PH5P, for some reason, bubbles the script to <head>
'PH5P' => false,
)
);
}
function test_tokenizeHTML_entitiesInComment() {
$this->config->set('Core', 'AggressivelyFixLt', true);
$this->assertTokenization(
'<!-- This comment < &lt; & -->',
array( new HTMLPurifier_Token_Comment(' This comment < &lt; & ') ),
array(
'PEARSax3' => false
)
);
}
function test_tokenizeHTML_attributeWithSpecialCharacters() {
$this->assertTokenization(
'<a href="><>">',
array( new HTMLPurifier_Token_Empty('a', array('href' => '><>')) ),
array(
'DirectLex' => array(
new HTMLPurifier_Token_Start('a', array('href' => '')),
new HTMLPurifier_Token_Text('<">'),
),
'PEARSax3' => false,
)
);
}
function test_tokenizeHTML_emptyTagWithSlashInAttribute() {
$this->assertTokenization(
'<param name="src" value="http://example.com/video.wmv" />',
array( new HTMLPurifier_Token_Empty('param', array('name' => 'src', 'value' => 'http://example.com/video.wmv')) )
);
}
/*
function test_tokenizeHTML_() {
$this->assertTokenization(
,
array(
)
);
}
*/
} }

@ -16,6 +16,7 @@ class HTMLPurifier_SimpleTest_Reporter extends HTMLReporter
 ?>><?php echo $file ?></option>
 <?php } ?>
 </select>
+<input type="checkbox" name="standalone" title="Standalone version?" <?php if(isset($_GET['standalone'])) {echo 'checked="checked" ';} ?>/>
 <input type="submit" value="Go">
 </form>
 <?php

@ -11,26 +11,36 @@ class HTMLPurifier_Strategy_CoreTest extends HTMLPurifier_StrategyHarness
 $this->obj = new HTMLPurifier_Strategy_Core();
 }
-function test() {
+function testBlankInput() {
 $this->assertResult('');
+}
+function testMakeWellFormed() {
 $this->assertResult(
 '<b>Make well formed.',
 '<b>Make well formed.</b>'
 );
+}
+function testFixNesting() {
 $this->assertResult(
 '<b><div>Fix nesting.</div></b>',
 '<b></b><div>Fix nesting.</div>'
 );
+}
+function testRemoveForeignElements() {
 $this->assertResult(
 '<asdf>Foreign element removal.</asdf>',
 'Foreign element removal.'
 );
+}
+function testFirstThree() {
 $this->assertResult(
 '<foo><b><div>All three.</div></b>',
 '<b></b><div>All three.</div>'
 );
 }
 }

@ -11,79 +11,81 @@ class HTMLPurifier_Strategy_FixNestingTest extends HTMLPurifier_StrategyHarness
$this->obj = new HTMLPurifier_Strategy_FixNesting(); $this->obj = new HTMLPurifier_Strategy_FixNesting();
} }
function testBlockAndInlineIntegration() { function testPreserveInlineInRoot() {
// legal inline
$this->assertResult('<b>Bold text</b>'); $this->assertResult('<b>Bold text</b>');
}
// legal inline and block (default parent element is FLOW)
function testPreserveInlineAndBlockInRoot() {
$this->assertResult('<a href="about:blank">Blank</a><div>Block</div>'); $this->assertResult('<a href="about:blank">Blank</a><div>Block</div>');
}
// illegal block in inline
function testRemoveBlockInInline() {
$this->assertResult( $this->assertResult(
'<b><div>Illegal div.</div></b>', '<b><div>Illegal div.</div></b>',
'<b>Illegal div.</b>' '<b>Illegal div.</b>'
); );
// same test with different configuration (fragile)
$this->assertResult(
'<b><div>Illegal div.</div></b>',
'<b>&lt;div&gt;Illegal div.&lt;/div&gt;</b>',
array('Core.EscapeInvalidChildren' => true)
);
} }
function testNodeRemovalIntegration() { function testEscapeBlockInInline() {
$this->config->set('Core', 'EscapeInvalidChildren', true);
// test of empty set that's required, resulting in removal of node $this->assertResult(
'<b><div>Illegal div.</div></b>',
'<b>&lt;div&gt;Illegal div.&lt;/div&gt;</b>'
);
}
function testRemoveNodeWithMissingRequiredElements() {
$this->assertResult('<ul></ul>', ''); $this->assertResult('<ul></ul>', '');
}
// test illegal text which gets removed
function testRemoveIllegalPCDATA() {
$this->assertResult( $this->assertResult(
'<ul>Illegal text<li>Legal item</li></ul>', '<ul>Illegal text<li>Legal item</li></ul>',
'<ul><li>Legal item</li></ul>' '<ul><li>Legal item</li></ul>'
); );
} }
function testTableIntegration() { function testCustomTableDefinition() {
// test custom table definition $this->assertResult('<table><tr><td>Cell 1</td></tr></table>');
$this->assertResult( }
'<table><tr><td>Cell 1</td></tr></table>'
); function testRemoveEmptyTable() {
$this->assertResult('<table></table>', ''); $this->assertResult('<table></table>', '');
} }
function testChameleonIntegration() { function testChameleonRemoveBlockInNodeInInline() {
// block in inline ins not allowed
$this->assertResult( $this->assertResult(
'<span><ins><div>Not allowed!</div></ins></span>', '<span><ins><div>Not allowed!</div></ins></span>',
'<span><ins>Not allowed!</ins></span>' '<span><ins>Not allowed!</ins></span>'
); );
}
// test block element that has inline content
function testChameleonRemoveBlockInBlockNodeWithInlineContent() {
$this->assertResult( $this->assertResult(
'<h1><ins><div>Not allowed!</div></ins></h1>', '<h1><ins><div>Not allowed!</div></ins></h1>',
'<h1><ins>Not allowed!</ins></h1>' '<h1><ins>Not allowed!</ins></h1>'
); );
}
// stacked ins/del
function testNestedChameleonRemoveBlockInNodeWithInlineContent() {
$this->assertResult( $this->assertResult(
'<h1><ins><del><div>Not allowed!</div></del></ins></h1>', '<h1><ins><del><div>Not allowed!</div></del></ins></h1>',
'<h1><ins><del>Not allowed!</del></ins></h1>' '<h1><ins><del>Not allowed!</del></ins></h1>'
); );
}
function testNestedChameleonPreserveBlockInBlock() {
$this->assertResult( $this->assertResult(
'<div><ins><del><div>Allowed!</div></del></ins></div>' '<div><ins><del><div>Allowed!</div></del></ins></div>'
); );
}
function testChameleonEscapeInvalidBlockInInline() {
$this->config->set('Core', 'EscapeInvalidChildren', true);
$this->assertResult( // alt config $this->assertResult( // alt config
'<span><ins><div>Not allowed!</div></ins></span>', '<span><ins><div>Not allowed!</div></ins></span>',
'<span><ins>&lt;div&gt;Not allowed!&lt;/div&gt;</ins></span>', '<span><ins>&lt;div&gt;Not allowed!&lt;/div&gt;</ins></span>'
array('Core.EscapeInvalidChildren' => true)
); );
} }
function testExclusionsIntegration() { function testExclusionsIntegration() {
@ -93,41 +95,37 @@ class HTMLPurifier_Strategy_FixNestingTest extends HTMLPurifier_StrategyHarness
'<a><span></span></a>' '<a><span></span></a>'
); );
} }
function testCustomParentIntegration() { function testPreserveInlineNodeInInlineRootNode() {
// test inline parent $this->config->set('HTML', 'Parent', 'span');
$this->assertResult( $this->assertResult('<b>Bold</b>');
'<b>Bold</b>', true, array('HTML.Parent' => 'span')
);
$this->assertResult(
'<div>Reject</div>', 'Reject', array('HTML.Parent' => 'span')
);
}
function testError() {
// test fallback to div
$this->expectError('Cannot use unrecognized element as parent.');
$this->assertResult(
'<div>Accept</div>', true, array('HTML.Parent' => 'obviously-impossible')
);
$this->swallowErrors();
} }
function testDoubleCheckIntegration() { function testRemoveBlockNodeInInlineRootNode() {
// breaks without the redundant checking code $this->config->set('HTML', 'Parent', 'span');
$this->assertResult('<div>Reject</div>', 'Reject');
}
function testInvalidParentError() {
// test fallback to div
$this->config->set('HTML', 'Parent', 'obviously-impossible');
$this->expectError('Cannot use unrecognized element as parent');
$this->assertResult('<div>Accept</div>');
}
function testCascadingRemovalOfNodesMissingRequiredChildren() {
$this->assertResult('<table><tr></tr></table>', ''); $this->assertResult('<table><tr></tr></table>', '');
}
// special case, prevents scrolling one back to find parent
function testCascadingRemovalSpecialCaseCannotScrollOneBack() {
$this->assertResult('<table><tr></tr><tr></tr></table>', ''); $this->assertResult('<table><tr></tr><tr></tr></table>', '');
}
// cascading rollbacks
$this->assertResult( function testLotsOfCascadingRemovalOfNodes() {
'<table><tbody><tr></tr><tr></tr></tbody><tr></tr><tr></tr></table>', $this->assertResult('<table><tbody><tr></tr><tr></tr></tbody><tr></tr><tr></tr></table>', '');
'' }
);
function testAdjacentRemovalOfNodeMissingRequiredChildren() {
// rollbacks twice
$this->assertResult('<table></table><table></table>', ''); $this->assertResult('<table></table><table></table>', '');
} }

@ -9,113 +9,77 @@ class HTMLPurifier_Strategy_MakeWellFormedTest extends HTMLPurifier_StrategyHarn
function setUp() { function setUp() {
parent::setUp(); parent::setUp();
$this->obj = new HTMLPurifier_Strategy_MakeWellFormed(); $this->obj = new HTMLPurifier_Strategy_MakeWellFormed();
$this->config = array();
} }
function testNormalIntegration() { function testEmptyInput() {
$this->assertResult(''); $this->assertResult('');
}
function testWellFormedInput() {
$this->assertResult('This is <b>bold text</b>.'); $this->assertResult('This is <b>bold text</b>.');
} }
function testUnclosedTagIntegration() { function testUnclosedTagTerminatedByDocumentEnd() {
$this->assertResult( $this->assertResult(
'<b>Unclosed tag, gasp!', '<b>Unclosed tag, gasp!',
'<b>Unclosed tag, gasp!</b>' '<b>Unclosed tag, gasp!</b>'
); );
}
function testUnclosedTagTerminatedByParentNodeEnd() {
$this->assertResult( $this->assertResult(
'<b><i>Bold and italic?</b>', '<b><i>Bold and italic?</b>',
'<b><i>Bold and italic?</i></b>' '<b><i>Bold and italic?</i></b>'
); );
}
function testRemoveStrayClosingTag() {
$this->assertResult( $this->assertResult(
'Unused end tags... recycle!</b>', 'Unused end tags... recycle!</b>',
'Unused end tags... recycle!' 'Unused end tags... recycle!'
); );
} }
function testEmptyTagDetectionIntegration() { function testConvertStartToEmpty() {
$this->assertResult( $this->assertResult(
'<br style="clear:both;">', '<br style="clear:both;">',
'<br style="clear:both;" />' '<br style="clear:both;" />'
); );
}
function testConvertEmptyToStart() {
$this->assertResult( $this->assertResult(
'<div style="clear:both;" />', '<div style="clear:both;" />',
'<div style="clear:both;"></div>' '<div style="clear:both;"></div>'
); );
} }
function testAutoClose() { function testAutoCloseParagraph() {
// paragraph
$this->assertResult( $this->assertResult(
'<p>Paragraph 1<p>Paragraph 2', '<p>Paragraph 1<p>Paragraph 2',
'<p>Paragraph 1</p><p>Paragraph 2</p>' '<p>Paragraph 1</p><p>Paragraph 2</p>'
); );
}
function testAutoCloseParagraphInsideDiv() {
$this->assertResult( $this->assertResult(
'<div><p>Paragraphs<p>In<p>A<p>Div</div>', '<div><p>Paragraphs<p>In<p>A<p>Div</div>',
'<div><p>Paragraphs</p><p>In</p><p>A</p><p>Div</p></div>' '<div><p>Paragraphs</p><p>In</p><p>A</p><p>Div</p></div>'
); );
}
// list
function testAutoCloseListItem() {
$this->assertResult( $this->assertResult(
'<ol><li>Item 1<li>Item 2</ol>', '<ol><li>Item 1<li>Item 2</ol>',
'<ol><li>Item 1</li><li>Item 2</li></ol>' '<ol><li>Item 1</li><li>Item 2</li></ol>'
); );
}
// colgroup
function testAutoCloseColgroup() {
$this->assertResult( $this->assertResult(
'<table><colgroup><col /><tr></tr></table>', '<table><colgroup><col /><tr></tr></table>',
'<table><colgroup><col /></colgroup><tr></tr></table>' '<table><colgroup><col /></colgroup><tr></tr></table>'
); );
}
function testMultipleInjectors() {
$this->config = array('AutoFormat.AutoParagraph' => true, 'AutoFormat.Linkify' => true);
$this->assertResult(
'Foobar',
'<p>Foobar</p>'
);
$this->assertResult(
'http://example.com',
'<p><a href="http://example.com">http://example.com</a></p>'
);
$this->assertResult(
'<b>http://example.com</b>',
'<p><b><a href="http://example.com">http://example.com</a></b></p>'
);
$this->assertResult(
'<b>http://example.com',
'<p><b><a href="http://example.com">http://example.com</a></b></p>'
);
$this->assertResult(
'http://example.com
http://dev.example.com',
'<p><a href="http://example.com">http://example.com</a></p><p><a href="http://dev.example.com">http://dev.example.com</a></p>'
);
$this->assertResult(
'http://example.com <div>http://example.com</div>',
'<p><a href="http://example.com">http://example.com</a> </p><div><a href="http://example.com">http://example.com</a></div>'
);
$this->assertResult(
'This URL http://example.com is what you need',
'<p>This URL <a href="http://example.com">http://example.com</a> is what you need</p>'
);
} }
} }

@ -0,0 +1,65 @@
<?php
require_once 'HTMLPurifier/StrategyHarness.php';
require_once 'HTMLPurifier/Strategy/MakeWellFormed.php';
class HTMLPurifier_Strategy_MakeWellFormed_InjectorTest extends HTMLPurifier_StrategyHarness
{
function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_Strategy_MakeWellFormed();
$this->config->set('AutoFormat', 'AutoParagraph', true);
$this->config->set('AutoFormat', 'Linkify', true);
}
function testOnlyAutoParagraph() {
$this->assertResult(
'Foobar',
'<p>Foobar</p>'
);
}
function testParagraphWrappingOnlyLink() {
$this->assertResult(
'http://example.com',
'<p><a href="http://example.com">http://example.com</a></p>'
);
}
function testParagraphWrappingNodeContainingLink() {
$this->assertResult(
'<b>http://example.com</b>',
'<p><b><a href="http://example.com">http://example.com</a></b></p>'
);
}
function testParagraphWrappingPoorlyFormedNodeContainingLink() {
$this->assertResult(
'<b>http://example.com',
'<p><b><a href="http://example.com">http://example.com</a></b></p>'
);
}
function testTwoParagraphsContainingOnlyOneLink() {
$this->assertResult(
"http://example.com\n\nhttp://dev.example.com",
'<p><a href="http://example.com">http://example.com</a></p><p><a href="http://dev.example.com">http://dev.example.com</a></p>'
);
}
function testParagraphNextToDivWithLinks() {
$this->assertResult(
'http://example.com <div>http://example.com</div>',
'<p><a href="http://example.com">http://example.com</a> </p><div><a href="http://example.com">http://example.com</a></div>'
);
}
function testRealisticLinkInSentence() {
$this->assertResult(
'This URL http://example.com is what you need',
'<p>This URL <a href="http://example.com">http://example.com</a> is what you need</p>'
);
}
}

@ -3,8 +3,7 @@
 require_once 'HTMLPurifier/StrategyHarness.php';
 require_once 'HTMLPurifier/Strategy/RemoveForeignElements.php';
-class HTMLPurifier_Strategy_RemoveForeignElementsTest
-extends HTMLPurifier_StrategyHarness
+class HTMLPurifier_Strategy_RemoveForeignElementsTest extends HTMLPurifier_StrategyHarness
 {
 function setUp() {
@ -12,96 +11,75 @@ class HTMLPurifier_Strategy_RemoveForeignElementsTest
$this->obj = new HTMLPurifier_Strategy_RemoveForeignElements(); $this->obj = new HTMLPurifier_Strategy_RemoveForeignElements();
} }
function test() { function testBlankInput() {
$this->config = array('HTML.Doctype' => 'XHTML 1.0 Strict');
$this->assertResult(''); $this->assertResult('');
}
function testPreserveRecognizedElements() {
$this->assertResult('This is <b>bold text</b>.'); $this->assertResult('This is <b>bold text</b>.');
}
function testRemoveForeignElements() {
$this->assertResult( $this->assertResult(
'<asdf>Bling</asdf><d href="bang">Bong</d><foobar />', '<asdf>Bling</asdf><d href="bang">Bong</d><foobar />',
'BlingBong' 'BlingBong'
); );
}
function testRemoveScriptAndContents() {
$this->assertResult( $this->assertResult(
'<script>alert();</script>', '<script>alert();</script>',
'' ''
); );
}
function testRemoveStyleAndContents() {
$this->assertResult( $this->assertResult(
'<style>.foo {blink;}</style>', '<style>.foo {blink;}</style>',
'' ''
); );
}
function testRemoveOnlyScriptTagsLegacy() {
$this->config->set('Core', 'RemoveScriptContents', false);
$this->assertResult( $this->assertResult(
'<script>alert();</script>', '<script>alert();</script>',
'alert();', 'alert();'
array('Core.RemoveScriptContents' => false)
); );
}
function testRemoveOnlyScriptTags() {
$this->config->set('Core', 'HiddenElements', array());
$this->assertResult( $this->assertResult(
'<script>alert();</script>', '<script>alert();</script>',
'alert();', 'alert();'
array('Core.HiddenElements' => array())
); );
}
$this->assertResult(
'<menu><li>Item 1</li></menu>', function testRemoveInvalidImg() {
'<ul><li>Item 1</li></ul>' $this->assertResult('<img />', '');
); }
// test center transform function testPreserveValidImg() {
$this->assertResult(
'<center>Look I am Centered!</center>',
'<div style="text-align:center;">Look I am Centered!</div>'
);
// test font transform
$this->assertResult(
'<font color="red" face="Arial" size="6">Big Warning!</font>',
'<span style="color:red;font-family:Arial;font-size:xx-large;">Big'.
' Warning!</span>'
);
// test removal of invalid img tag
$this->assertResult(
'<img />',
''
);
// test preservation of valid img tag
$this->assertResult('<img src="foobar.gif" alt="foobar.gif" />'); $this->assertResult('<img src="foobar.gif" alt="foobar.gif" />');
}
// test preservation of invalid img tag when removal is disabled
$this->assertResult( function testPreserveInvalidImgWhenRemovalIsDisabled() {
'<img />', $this->config->set('Core', 'RemoveInvalidImg', false);
true, $this->assertResult('<img />');
array( }
'Core.RemoveInvalidImg' => false
) function testTextifyCommentedScriptContents() {
); $this->config->set('HTML', 'Trusted', true);
$this->config->set('Output', 'CommentScriptContents', false); // simplify output
// test transform to unallowed element
$this->assertResult(
'<font color="red" face="Arial" size="6">Big Warning!</font>',
'Big Warning!',
array('HTML.Allowed' => 'div')
);
// text-ify commented script contents ( the trailing comment gets
// removed during generation )
$this->assertResult( $this->assertResult(
'<script type="text/javascript"><!-- '<script type="text/javascript"><!--
alert(<b>bold</b>); alert(<b>bold</b>);
// --></script>', // --></script>',
'<script type="text/javascript"> '<script type="text/javascript">
alert(&lt;b&gt;bold&lt;/b&gt;); alert(&lt;b&gt;bold&lt;/b&gt;);
// </script>', // </script>'
array('HTML.Trusted' => true, 'Output.CommentScriptContents' => false)
); );
} }
} }

@ -0,0 +1,46 @@
<?php
require_once 'HTMLPurifier/StrategyHarness.php';
require_once 'HTMLPurifier/Strategy/RemoveForeignElements.php';
class HTMLPurifier_Strategy_RemoveForeignElements_TidyTest
extends HTMLPurifier_StrategyHarness
{
function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_Strategy_RemoveForeignElements();
$this->config->set('HTML', 'TidyLevel', 'heavy');
}
function testCenterTransform() {
$this->assertResult(
'<center>Look I am Centered!</center>',
'<div style="text-align:center;">Look I am Centered!</div>'
);
}
function testFontTransform() {
$this->assertResult(
'<font color="red" face="Arial" size="6">Big Warning!</font>',
'<span style="color:red;font-family:Arial;font-size:xx-large;">Big'.
' Warning!</span>'
);
}
function testTransformToForbiddenElement() {
$this->config->set('HTML', 'Allowed', 'div');
$this->assertResult(
'<font color="red" face="Arial" size="6">Big Warning!</font>',
'Big Warning!'
);
}
function testMenuTransform() {
$this->assertResult(
'<menu><li>Item 1</li></menu>',
'<ul><li>Item 1</li></ul>'
);
}
}

@ -1,6 +1,5 @@
 <?php
-require_once('HTMLPurifier/Config.php');
 require_once('HTMLPurifier/StrategyHarness.php');
 require_once('HTMLPurifier/Strategy/ValidateAttributes.php');
@@ -11,126 +10,99 @@ class HTMLPurifier_Strategy_ValidateAttributesTest extends
 function setUp() {
 parent::setUp();
 $this->obj = new HTMLPurifier_Strategy_ValidateAttributes();
-$this->config = array('HTML.Doctype' => 'XHTML 1.0 Strict');
 }
-function testEmpty() {
+function testEmptyInput() {
 $this->assertResult('');
 }
-function testIDs() {
+function testRemoveIDByDefault() {
 $this->assertResult(
 '<div id="valid">Kill the ID.</div>',
 '<div>Kill the ID.</div>'
 );
-$this->assertResult('<div id="valid">Preserve the ID.</div>', true,
-array('HTML.EnableAttrID' => true));
-$this->assertResult(
-'<div id="0invalid">Kill the ID.</div>',
-'<div>Kill the ID.</div>',
-array('HTML.EnableAttrID' => true)
-);
-// test id accumulator
-$this->assertResult(
-'<div id="valid">Valid</div><div id="valid">Invalid</div>',
-'<div id="valid">Valid</div><div>Invalid</div>',
-array('HTML.EnableAttrID' => true)
-);
+}
+function testRemoveInvalidDir() {
 $this->assertResult(
 '<span dir="up-to-down">Bad dir.</span>',
 '<span>Bad dir.</span>'
 );
-// test attribute key case sensitivity
-$this->assertResult(
-'<div ID="valid">Convert ID to lowercase.</div>',
-'<div id="valid">Convert ID to lowercase.</div>',
-array('HTML.EnableAttrID' => true)
-);
-// test simple attribute substitution
-$this->assertResult(
-'<div id=" valid ">Trim whitespace.</div>',
-'<div id="valid">Trim whitespace.</div>',
-array('HTML.EnableAttrID' => true)
-);
-// test configuration id blacklist
-$this->assertResult(
-'<div id="invalid">Invalid</div>',
-'<div>Invalid</div>',
-array(
-'Attr.IDBlacklist' => array('invalid'),
-'HTML.EnableAttrID' => true
-)
-);
-// name rewritten as id
-$this->assertResult(
-'<a name="foobar" />',
-'<a id="foobar" />',
-array('HTML.EnableAttrID' => true)
-);
 }
-function testClasses() {
+function testPreserveValidClass() {
 $this->assertResult('<div class="valid">Valid</div>');
+}
+function testSelectivelyRemoveInvalidClasses() {
 $this->assertResult(
 '<div class="valid 0invalid">Keep valid.</div>',
 '<div class="valid">Keep valid.</div>'
 );
 }
-function testTitle() {
+function testPreserveTitle() {
 $this->assertResult(
 '<acronym title="PHP: Hypertext Preprocessor">PHP</acronym>'
 );
 }
-function testLang() {
+function testAddXMLLang() {
 $this->assertResult(
 '<span lang="fr">La soupe.</span>',
 '<span lang="fr" xml:lang="fr">La soupe.</span>'
 );
-// test only xml:lang for XHTML 1.1
+}
+function testOnlyXMLLangInXHTML11() {
+$this->config->set('HTML', 'Doctype', 'XHTML 1.1');
 $this->assertResult(
 '<b lang="en">asdf</b>',
-'<b xml:lang="en">asdf</b>', array('HTML.Doctype' => 'XHTML 1.1')
+'<b xml:lang="en">asdf</b>'
 );
 }
-function testAlign() {
-$this->assertResult(
-'<h1 align="center">Centered Headline</h1>',
-'<h1 style="text-align:center;">Centered Headline</h1>'
-);
-$this->assertResult(
-'<h1 align="right">Right-aligned Headline</h1>',
-'<h1 style="text-align:right;">Right-aligned Headline</h1>'
-);
-$this->assertResult(
-'<h1 align="left">Left-aligned Headline</h1>',
-'<h1 style="text-align:left;">Left-aligned Headline</h1>'
-);
-$this->assertResult(
-'<p align="justify">Justified Paragraph</p>',
-'<p style="text-align:justify;">Justified Paragraph</p>'
-);
-$this->assertResult(
-'<h1 align="invalid">Invalid Headline</h1>',
-'<h1>Invalid Headline</h1>'
-);
+function testBasicURI() {
+$this->assertResult('<a href="http://www.google.com/">Google</a>');
 }
-function testTable() {
+function testInvalidURI() {
+$this->assertResult(
+'<a href="javascript:badstuff();">Google</a>',
+'<a>Google</a>'
+);
+}
+function testBdoAddMissingDir() {
+$this->assertResult(
+'<bdo>Go left.</bdo>',
+'<bdo dir="ltr">Go left.</bdo>'
+);
+}
+function testBdoReplaceInvalidDirWithDefault() {
+$this->assertResult(
+'<bdo dir="blahblah">Invalid value!</bdo>',
+'<bdo dir="ltr">Invalid value!</bdo>'
+);
+}
+function testBdoAlternateDefaultDir() {
+$this->config->set('Attr', 'DefaultTextDir', 'rtl');
+$this->assertResult(
+'<bdo>Go right.</bdo>',
+'<bdo dir="rtl">Go right.</bdo>'
+);
+}
+function testRemoveDirWhenNotRequired() {
+$this->assertResult(
+'<span dir="blahblah">Invalid value!</span>',
+'<span>Invalid value!</span>'
+);
+}
+function testTableAttributes() {
 $this->assertResult(
 '<table frame="above" rules="rows" summary="A test table" border="2" cellpadding="5%" cellspacing="3" width="100%">
 <col align="right" width="4*" />
@@ -148,293 +120,64 @@ class HTMLPurifier_Strategy_ValidateAttributesTest extends
 </tr>
 </table>'
 );
-// test col.span is non-zero
+}
+function testColSpanIsNonZero() {
 $this->assertResult(
 '<col span="0" />',
 '<col />'
 );
-// lengths
-$this->assertResult(
-'<td width="5%" height="10" /><th width="10" height="5%" /><hr width="10" height="10" />',
-'<td style="width:5%;height:10px;" /><th style="width:10px;height:5%;" /><hr style="width:10px;" />'
-);
-// td boolean transformation
-$this->assertResult(
-'<td nowrap />',
-'<td style="white-space:nowrap;" />'
-);
-// caption align transformation
-$this->assertResult(
-'<caption align="left" />',
-'<caption style="text-align:left;" />'
-);
-$this->assertResult(
-'<caption align="right" />',
-'<caption style="text-align:right;" />'
-);
-$this->assertResult(
-'<caption align="top" />',
-'<caption style="caption-side:top;" />'
-);
-$this->assertResult(
-'<caption align="bottom" />',
-'<caption style="caption-side:bottom;" />'
-);
-$this->assertResult(
-'<caption align="nonsense" />',
-'<caption />'
-);
-// align transformation
-$this->assertResult(
-'<table align="left" />',
-'<table style="float:left;" />'
-);
-$this->assertResult(
-'<table align="center" />',
-'<table style="margin-left:auto;margin-right:auto;" />'
-);
-$this->assertResult(
-'<table align="right" />',
-'<table style="float:right;" />'
-);
-$this->assertResult(
-'<table align="top" />',
-'<table />'
-);
 }
-function testURI() {
-$this->assertResult('<a href="http://www.google.com/">Google</a>');
-// test invalid URI
-$this->assertResult(
-'<a href="javascript:badstuff();">Google</a>',
-'<a>Google</a>'
-);
-}
-function testImg() {
+function testImgAddDefaults() {
+$this->config->set('Core', 'RemoveInvalidImg', false);
 $this->assertResult(
 '<img />',
-'<img src="" alt="Invalid image" />',
-array('Core.RemoveInvalidImg' => false)
+'<img src="" alt="Invalid image" />'
 );
+}
+function testImgGenerateAlt() {
 $this->assertResult(
 '<img src="foobar.jpg" />',
 '<img src="foobar.jpg" alt="foobar.jpg" />'
 );
+}
+function testImgAddDefaultSrc() {
+$this->config->set('Core', 'RemoveInvalidImg', false);
 $this->assertResult(
 '<img alt="pretty picture" />',
-'<img alt="pretty picture" src="" />',
-array('Core.RemoveInvalidImg' => false)
+'<img alt="pretty picture" src="" />'
 );
-// mailto in image is not allowed
+}
+function testImgRemoveNonRetrievableProtocol() {
+$this->config->set('Core', 'RemoveInvalidImg', false);
 $this->assertResult(
 '<img src="mailto:foo@example.com" />',
-'<img alt="mailto:foo@example.com" src="" />',
-array('Core.RemoveInvalidImg' => false)
-);
-// align transformation
-$this->assertResult(
-'<img src="foobar.jpg" alt="foobar" align="left" />',
-'<img src="foobar.jpg" alt="foobar" style="float:left;" />'
-);
-$this->assertResult(
-'<img src="foobar.jpg" alt="foobar" align="right" />',
-'<img src="foobar.jpg" alt="foobar" style="float:right;" />'
-);
-$this->assertResult(
-'<img src="foobar.jpg" alt="foobar" align="bottom" />',
-'<img src="foobar.jpg" alt="foobar" style="vertical-align:baseline;" />'
-);
-$this->assertResult(
-'<img src="foobar.jpg" alt="foobar" align="middle" />',
-'<img src="foobar.jpg" alt="foobar" style="vertical-align:middle;" />'
-);
-$this->assertResult(
-'<img src="foobar.jpg" alt="foobar" align="top" />',
-'<img src="foobar.jpg" alt="foobar" style="vertical-align:top;" />'
-);
-$this->assertResult(
-'<img src="foobar.jpg" alt="foobar" align="outerspace" />',
-'<img src="foobar.jpg" alt="foobar" />'
-);
-}
-function testBdo() {
-// test required attributes for bdo
-$this->assertResult(
-'<bdo>Go left.</bdo>',
-'<bdo dir="ltr">Go left.</bdo>'
-);
-$this->assertResult(
-'<bdo dir="blahblah">Invalid value!</bdo>',
-'<bdo dir="ltr">Invalid value!</bdo>'
+'<img alt="mailto:foo@example.com" src="" />'
 );
 }
-function testDir() {
-// see testBdo, behavior is subtly different
-$this->assertResult(
-'<span dir="blahblah">Invalid value!</span>',
-'<span>Invalid value!</span>'
-);
+function testPreserveRel() {
+$this->config->set('Attr', 'AllowedRel', 'nofollow');
+$this->assertResult('<a href="foo" rel="nofollow" />');
 }
-function testLinks() {
-// link types
-$this->assertResult(
-'<a href="foo" rel="nofollow" />',
-true,
-array('Attr.AllowedRel' => 'nofollow')
-);
-// link targets
-$this->assertResult(
-'<a href="foo" target="_top" />',
-true,
-array('Attr.AllowedFrameTargets' => '_top',
-'HTML.Doctype' => 'XHTML 1.0 Transitional')
-);
+function testPreserveTarget() {
+$this->config->set('Attr', 'AllowedFrameTargets', '_top');
+$this->config->set('HTML', 'Doctype', 'XHTML 1.0 Transitional');
+$this->assertResult('<a href="foo" target="_top" />');
+}
+function testRemoveTargetWhenNotSupported() {
+$this->config->set('HTML', 'Doctype', 'XHTML 1.0 Strict');
+$this->config->set('Attr', 'AllowedFrameTargets', '_top');
 $this->assertResult(
 '<a href="foo" target="_top" />',
 '<a href="foo" />'
 );
-$this->assertResult(
-'<a href="foo" target="_top" />',
-'<a href="foo" />',
-array('Attr.AllowedFrameTargets' => '_top', 'HTML.Strict' => true)
-);
-}
-function testBorder() {
-// border
-$this->assertResult(
-'<img src="foo" alt="foo" hspace="1" vspace="3" />',
-'<img src="foo" alt="foo" style="margin-top:3px;margin-bottom:3px;margin-left:1px;margin-right:1px;" />',
-array('Attr.AllowedRel' => 'nofollow')
-);
-}
-function testHr() {
-$this->assertResult(
-'<hr size="3" />',
-'<hr style="height:3px;" />'
-);
-$this->assertResult(
-'<hr noshade />',
-'<hr style="color:#808080;background-color:#808080;border:0;" />'
-);
-// align transformation
-$this->assertResult(
-'<hr align="left" />',
-'<hr style="margin-left:0;margin-right:auto;text-align:left;" />'
-);
-$this->assertResult(
-'<hr align="center" />',
-'<hr style="margin-left:auto;margin-right:auto;text-align:center;" />'
-);
-$this->assertResult(
-'<hr align="right" />',
-'<hr style="margin-left:auto;margin-right:0;text-align:right;" />'
-);
-$this->assertResult(
-'<hr align="bottom" />',
-'<hr />'
-);
-}
-function testBr() {
-// br clear transformation
-$this->assertResult(
-'<br clear="left" />',
-'<br style="clear:left;" />'
-);
-$this->assertResult(
-'<br clear="right" />',
-'<br style="clear:right;" />'
-);
-$this->assertResult( // test both?
-'<br clear="all" />',
-'<br style="clear:both;" />'
-);
-$this->assertResult(
-'<br clear="none" />',
-'<br style="clear:none;" />'
-);
-$this->assertResult(
-'<br clear="foo" />',
-'<br />'
-);
-}
-function testListTypeTransform() {
-// ul
-$this->assertResult(
-'<ul type="disc" />',
-'<ul style="list-style-type:disc;" />'
-);
-$this->assertResult(
-'<ul type="square" />',
-'<ul style="list-style-type:square;" />'
-);
-$this->assertResult(
-'<ul type="circle" />',
-'<ul style="list-style-type:circle;" />'
-);
-$this->assertResult( // case insensitive
-'<ul type="CIRCLE" />',
-'<ul style="list-style-type:circle;" />'
-);
-$this->assertResult(
-'<ul type="a" />',
-'<ul />'
-);
-// ol
-$this->assertResult(
-'<ol type="1" />',
-'<ol style="list-style-type:decimal;" />'
-);
-$this->assertResult(
-'<ol type="i" />',
-'<ol style="list-style-type:lower-roman;" />'
-);
-$this->assertResult(
-'<ol type="I" />',
-'<ol style="list-style-type:upper-roman;" />'
-);
-$this->assertResult(
-'<ol type="a" />',
-'<ol style="list-style-type:lower-alpha;" />'
-);
-$this->assertResult(
-'<ol type="A" />',
-'<ol style="list-style-type:upper-alpha;" />'
-);
-$this->assertResult(
-'<ol type="disc" />',
-'<ol />'
-);
-// li
-$this->assertResult(
-'<li type="circle" />',
-'<li style="list-style-type:circle;" />'
-);
-$this->assertResult(
-'<li type="A" />',
-'<li style="list-style-type:upper-alpha;" />'
-);
-$this->assertResult( // case sensitive
-'<li type="CIRCLE" />',
-'<li />'
-);
 }
 }

@@ -0,0 +1,65 @@
<?php
require_once('HTMLPurifier/StrategyHarness.php');
require_once('HTMLPurifier/Strategy/ValidateAttributes.php');
class HTMLPurifier_Strategy_ValidateAttributes_IDTest extends HTMLPurifier_StrategyHarness
{
function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_Strategy_ValidateAttributes();
$this->config->set('HTML', 'EnableAttrID', true);
}
function testPreserveIDWhenEnabled() {
$this->assertResult('<div id="valid">Preserve the ID.</div>');
}
function testRemoveInvalidID() {
$this->assertResult(
'<div id="0invalid">Kill the ID.</div>',
'<div>Kill the ID.</div>'
);
}
function testRemoveDuplicateID() {
$this->assertResult(
'<div id="valid">Valid</div><div id="valid">Invalid</div>',
'<div id="valid">Valid</div><div>Invalid</div>'
);
}
function testAttributeKeyCaseInsensitivity() {
$this->assertResult(
'<div ID="valid">Convert ID to lowercase.</div>',
'<div id="valid">Convert ID to lowercase.</div>'
);
}
function testTrimWhitespace() {
$this->assertResult(
'<div id=" valid ">Trim whitespace.</div>',
'<div id="valid">Trim whitespace.</div>'
);
}
function testIDBlacklist() {
$this->config->set('Attr', 'IDBlacklist', array('invalid'));
$this->assertResult(
'<div id="invalid">Invalid</div>',
'<div>Invalid</div>'
);
}
function testNameConvertedToID() {
$this->config->set('HTML', 'TidyLevel', 'heavy');
$this->assertResult(
'<a name="foobar" />',
'<a id="foobar" />'
);
}
}

@@ -0,0 +1,353 @@
<?php
require_once('HTMLPurifier/StrategyHarness.php');
require_once('HTMLPurifier/Strategy/ValidateAttributes.php');
class HTMLPurifier_Strategy_ValidateAttributes_TidyTest extends HTMLPurifier_StrategyHarness
{
function setUp() {
parent::setUp();
$this->obj = new HTMLPurifier_Strategy_ValidateAttributes();
$this->config->set('HTML', 'TidyLevel', 'heavy');
}
function testConvertCenterAlign() {
$this->assertResult(
'<h1 align="center">Centered Headline</h1>',
'<h1 style="text-align:center;">Centered Headline</h1>'
);
}
function testConvertRightAlign() {
$this->assertResult(
'<h1 align="right">Right-aligned Headline</h1>',
'<h1 style="text-align:right;">Right-aligned Headline</h1>'
);
}
function testConvertLeftAlign() {
$this->assertResult(
'<h1 align="left">Left-aligned Headline</h1>',
'<h1 style="text-align:left;">Left-aligned Headline</h1>'
);
}
function testConvertJustifyAlign() {
$this->assertResult(
'<p align="justify">Justified Paragraph</p>',
'<p style="text-align:justify;">Justified Paragraph</p>'
);
}
function testRemoveInvalidAlign() {
$this->assertResult(
'<h1 align="invalid">Invalid Headline</h1>',
'<h1>Invalid Headline</h1>'
);
}
function testConvertTableLengths() {
$this->assertResult(
'<td width="5%" height="10" /><th width="10" height="5%" /><hr width="10" height="10" />',
'<td style="width:5%;height:10px;" /><th style="width:10px;height:5%;" /><hr style="width:10px;" />'
);
}
function testTdConvertNowrap() {
$this->assertResult(
'<td nowrap />',
'<td style="white-space:nowrap;" />'
);
}
function testCaptionConvertAlignLeft() {
$this->assertResult(
'<caption align="left" />',
'<caption style="text-align:left;" />'
);
}
function testCaptionConvertAlignRight() {
$this->assertResult(
'<caption align="right" />',
'<caption style="text-align:right;" />'
);
}
function testCaptionConvertAlignTop() {
$this->assertResult(
'<caption align="top" />',
'<caption style="caption-side:top;" />'
);
}
function testCaptionConvertAlignBottom() {
$this->assertResult(
'<caption align="bottom" />',
'<caption style="caption-side:bottom;" />'
);
}
function testCaptionRemoveInvalidAlign() {
$this->assertResult(
'<caption align="nonsense" />',
'<caption />'
);
}
function testTableConvertAlignLeft() {
$this->assertResult(
'<table align="left" />',
'<table style="float:left;" />'
);
}
function testTableConvertAlignCenter() {
$this->assertResult(
'<table align="center" />',
'<table style="margin-left:auto;margin-right:auto;" />'
);
}
function testTableConvertAlignRight() {
$this->assertResult(
'<table align="right" />',
'<table style="float:right;" />'
);
}
function testTableRemoveInvalidAlign() {
$this->assertResult(
'<table align="top" />',
'<table />'
);
}
function testImgConvertAlignLeft() {
$this->assertResult(
'<img src="foobar.jpg" alt="foobar" align="left" />',
'<img src="foobar.jpg" alt="foobar" style="float:left;" />'
);
}
function testImgConvertAlignRight() {
$this->assertResult(
'<img src="foobar.jpg" alt="foobar" align="right" />',
'<img src="foobar.jpg" alt="foobar" style="float:right;" />'
);
}
function testImgConvertAlignBottom() {
$this->assertResult(
'<img src="foobar.jpg" alt="foobar" align="bottom" />',
'<img src="foobar.jpg" alt="foobar" style="vertical-align:baseline;" />'
);
}
function testImgConvertAlignMiddle() {
$this->assertResult(
'<img src="foobar.jpg" alt="foobar" align="middle" />',
'<img src="foobar.jpg" alt="foobar" style="vertical-align:middle;" />'
);
}
function testImgConvertAlignTop() {
$this->assertResult(
'<img src="foobar.jpg" alt="foobar" align="top" />',
'<img src="foobar.jpg" alt="foobar" style="vertical-align:top;" />'
);
}
function testImgRemoveInvalidAlign() {
$this->assertResult(
'<img src="foobar.jpg" alt="foobar" align="outerspace" />',
'<img src="foobar.jpg" alt="foobar" />'
);
}
function testBorderConvertHVSpace() {
$this->assertResult(
'<img src="foo" alt="foo" hspace="1" vspace="3" />',
'<img src="foo" alt="foo" style="margin-top:3px;margin-bottom:3px;margin-left:1px;margin-right:1px;" />'
);
}
function testHrConvertSize() {
$this->assertResult(
'<hr size="3" />',
'<hr style="height:3px;" />'
);
}
function testHrConvertNoshade() {
$this->assertResult(
'<hr noshade />',
'<hr style="color:#808080;background-color:#808080;border:0;" />'
);
}
function testHrConvertAlignLeft() {
$this->assertResult(
'<hr align="left" />',
'<hr style="margin-left:0;margin-right:auto;text-align:left;" />'
);
}
function testHrConvertAlignCenter() {
$this->assertResult(
'<hr align="center" />',
'<hr style="margin-left:auto;margin-right:auto;text-align:center;" />'
);
}
function testHrConvertAlignRight() {
$this->assertResult(
'<hr align="right" />',
'<hr style="margin-left:auto;margin-right:0;text-align:right;" />'
);
}
function testHrRemoveInvalidAlign() {
$this->assertResult(
'<hr align="bottom" />',
'<hr />'
);
}
function testBrConvertClearLeft() {
$this->assertResult(
'<br clear="left" />',
'<br style="clear:left;" />'
);
}
function testBrConvertClearRight() {
$this->assertResult(
'<br clear="right" />',
'<br style="clear:right;" />'
);
}
function testBrConvertClearAll() {
$this->assertResult(
'<br clear="all" />',
'<br style="clear:both;" />'
);
}
function testBrConvertClearNone() {
$this->assertResult(
'<br clear="none" />',
'<br style="clear:none;" />'
);
}
function testBrRemoveInvalidClear() {
$this->assertResult(
'<br clear="foo" />',
'<br />'
);
}
function testUlConvertTypeDisc() {
$this->assertResult(
'<ul type="disc" />',
'<ul style="list-style-type:disc;" />'
);
}
function testUlConvertTypeSquare() {
$this->assertResult(
'<ul type="square" />',
'<ul style="list-style-type:square;" />'
);
}
function testUlConvertTypeCircle() {
$this->assertResult(
'<ul type="circle" />',
'<ul style="list-style-type:circle;" />'
);
}
function testUlConvertTypeCaseInsensitive() {
$this->assertResult(
'<ul type="CIRCLE" />',
'<ul style="list-style-type:circle;" />'
);
}
function testUlRemoveInvalidType() {
$this->assertResult(
'<ul type="a" />',
'<ul />'
);
}
function testOlConvertType1() {
$this->assertResult(
'<ol type="1" />',
'<ol style="list-style-type:decimal;" />'
);
}
function testOlConvertTypeLowerI() {
$this->assertResult(
'<ol type="i" />',
'<ol style="list-style-type:lower-roman;" />'
);
}
function testOlConvertTypeUpperI() {
$this->assertResult(
'<ol type="I" />',
'<ol style="list-style-type:upper-roman;" />'
);
}
function testOlConvertTypeLowerA() {
$this->assertResult(
'<ol type="a" />',
'<ol style="list-style-type:lower-alpha;" />'
);
}
function testOlConvertTypeUpperA() {
$this->assertResult(
'<ol type="A" />',
'<ol style="list-style-type:upper-alpha;" />'
);
}
function testOlRemoveInvalidType() {
$this->assertResult(
'<ol type="disc" />',
'<ol />'
);
}
function testLiConvertTypeCircle() {
$this->assertResult(
'<li type="circle" />',
'<li style="list-style-type:circle;" />'
);
}
function testLiConvertTypeA() {
$this->assertResult(
'<li type="A" />',
'<li style="list-style-type:upper-alpha;" />'
);
}
function testLiConvertTypeCaseSensitive() {
$this->assertResult(
'<li type="CIRCLE" />',
'<li />'
);
}
}

@@ -5,7 +5,7 @@
 error_reporting(E_ALL | E_STRICT);
 define('HTMLPurifierTest', 1);
-define('HTMLPURIFIER_SCHEMA_STRICT', true);
+define('HTMLPURIFIER_SCHEMA_STRICT', true); // validate schemas
 // wishlist: automated calling of this file from multiple PHP versions so we
 // don't have to constantly switch around
@@ -13,10 +13,11 @@ define('HTMLPURIFIER_SCHEMA_STRICT', true);
 // default settings (protect against register_globals)
 $GLOBALS['HTMLPurifierTest'] = array();
 $GLOBALS['HTMLPurifierTest']['PEAR'] = false; // do PEAR tests
+$GLOBALS['HTMLPurifierTest']['PH5P'] = version_compare(PHP_VERSION, "5", ">=") && class_exists('DOMDocument');
 $simpletest_location = 'simpletest/'; // reasonable guess
 // load SimpleTest
-@include '../test-settings.php'; // don't mind if it isn't there
+if (file_exists('../test-settings.php')) include '../test-settings.php';
 require_once $simpletest_location . 'unit_tester.php';
 require_once $simpletest_location . 'reporter.php';
 require_once $simpletest_location . 'mock_objects.php';
@@ -79,7 +80,6 @@ if ($test_file = $GLOBALS['HTMLPurifierTest']['File']) {
 } else {
 $test = new GroupTest('All Tests');
 foreach ($test_files as $test_file) {
 require_once $test_file;
 $test->addTestClass(path2class($test_file));

@@ -79,6 +79,7 @@ $test_files[] = 'HTMLPurifier/GeneratorTest.php';
 $test_files[] = 'HTMLPurifier/HTMLDefinitionTest.php';
 $test_files[] = 'HTMLPurifier/HTMLModuleManagerTest.php';
 $test_files[] = 'HTMLPurifier/HTMLModuleTest.php';
+$test_files[] = 'HTMLPurifier/HTMLModule/ObjectTest.php';
 $test_files[] = 'HTMLPurifier/HTMLModule/RubyTest.php';
 $test_files[] = 'HTMLPurifier/HTMLModule/ScriptingTest.php';
 $test_files[] = 'HTMLPurifier/HTMLModule/TidyTest.php';
@@ -98,9 +99,13 @@ $test_files[] = 'HTMLPurifier/Strategy/FixNestingTest.php';
 $test_files[] = 'HTMLPurifier/Strategy/FixNesting_ErrorsTest.php';
 $test_files[] = 'HTMLPurifier/Strategy/MakeWellFormedTest.php';
 $test_files[] = 'HTMLPurifier/Strategy/MakeWellFormed_ErrorsTest.php';
+$test_files[] = 'HTMLPurifier/Strategy/MakeWellFormed_InjectorTest.php';
 $test_files[] = 'HTMLPurifier/Strategy/RemoveForeignElementsTest.php';
 $test_files[] = 'HTMLPurifier/Strategy/RemoveForeignElements_ErrorsTest.php';
+$test_files[] = 'HTMLPurifier/Strategy/RemoveForeignElements_TidyTest.php';
 $test_files[] = 'HTMLPurifier/Strategy/ValidateAttributesTest.php';
+$test_files[] = 'HTMLPurifier/Strategy/ValidateAttributes_IDTest.php';
+$test_files[] = 'HTMLPurifier/Strategy/ValidateAttributes_TidyTest.php';
 $test_files[] = 'HTMLPurifier/TagTransformTest.php';
 $test_files[] = 'HTMLPurifier/TokenTest.php';
 $test_files[] = 'HTMLPurifier/URIDefinitionTest.php';