diff --git a/NEWS b/NEWS
index 4a494326..2bd97555 100644
--- a/NEWS
+++ b/NEWS
@@ -11,12 +11,22 @@ NEWS ( CHANGELOG and HISTORY ) HTMLPurifier
 
 1.4.0, unknown release date
 ! Implemented list-style-image, URIs now allowed in list-style
-! Implemented background-image, background-repeat and background-attachment
- CSS properties. background shorthand property HAS NOT been extended
- to allow these, and background-position IS NOT implemented yet.
+! Implemented background-image, background-repeat, background-attachment
+ and background-position CSS properties. Shorthand property background
+ supports all of these properties.
 ! Configuration documentation looks nicer
-! Added smoketest 'all.php', which loads all other smoketests via frames
+! Added %Core.EscapeNonASCIICharacters to workaround loss of Unicode
+ characters while %Core.Encoding is set to a non-UTF-8 encoding.
+! Support for configuration directive aliases added
+! Config object can now be instantiated from ini files
+! YouTube preservation code added to the core, with two lines of code
+ you can add it as a filter to your code. See smoketests/preserveYouTube.php
+ for sample code.
+- Replaced version check with functionality check for DOM (thanks Stephen
+ Khoo)
+. Added smoketest 'all.php', which loads all other smoketests via frames
 . Implemented AttrDef_CSSURI for url(http://google.com) style declarations
+. Added convenient single test selector form on test runner
 
 1.3.3, unknown release date, likely to be dropped
 ! Moved SLOW to docs/enduser-slow.html and added code examples
diff --git a/TODO b/TODO
index 6a3bc84b..aa625d3a 100644
--- a/TODO
+++ b/TODO
@@ -7,19 +7,14 @@ TODO List
 ? At-risk
 ==========================
 
-1.4 release
- # More extensive URI filtering schemes (see docs/proposal-new-directives.txt)
- # Allow for background-image and list-style-image (intrinsically tied to above)
- # Add hooks for custom behavior (for instance, YouTube preservation)
- - Aggressive caching
- ? 
Rich set* methods and config file loaders for HTMLPurifier_Config - ? Configuration profiles: sets of directives that get set with one func call - ? ConfigSchema directive aliases (so we can rename some of them) - ? URI validation routines tighter (see docs/dev-code-quality.html) (COMPLEX) - 1.5 release + # Implement all non-essential attribute transforms + # URI validation routines tighter (see docs/dev-code-quality.html) (COMPLEX) + # Advanced URI filtering schemes (see docs/proposal-new-directives.txt) # Error logging for filtering/cleanup procedures - Requires I18N facilities to be created first (COMPLEX) + ? Configuration profiles: sets of directives that get set with one func call + - XSS-attempt detection 1.6 release # Add pre-packaged "levels" of cleaning (custom behavior already done) @@ -28,14 +23,30 @@ TODO List specification of elements that, when detected as foreign, trigger removal of children, although unbalanced tags could wreck havoc (or at least delete the rest of the document)). + - Allow specifying global attributes on a tag-by-tag basis in + %HTML.AllowAttributes + ? More user-friendly warnings when %HTML.Allow* attempts to specify a + tag or attribute that is not supported + - Parse TinyMCE whitelist into our %HTML.Allow* whitelists 1.7 release # Additional support for poorly written HTML - - Implement all non-essential attribute transforms (BIG!) - Microsoft Word HTML cleaning (i.e. MsoNormal, but research essential!) - Friendly strict handling of
(block ->or related tags) - - Win32 Phalanger C# binaries (?) - - Remove redundant tags, ex. Underlined. Implementation notes: - 1. Analyzing which tags to remove duplicants - 2. Ensure attributes are merged into the parent tag - 3. Extend the tag exclusion system to specify whether or not the - contents should be dropped or not (currently, there's code that could do - something like this if it didn't drop the inner text too.) - - More user-friendly warnings when %HTML.Allow* attempts to specify a - tag or attribute that is not supported - - Allow specifying global attributes on a tag-by-tag basis in - %HTML.AllowAttributes - - Parse TinyMCE whitelist into our %HTML.Allow* whitelists - - XSS-attempt detection + ? Win32 Phalanger C# binaries Wontfix - Non-lossy smart alternate character encoding transformations (unless diff --git a/configdoc/generate.php b/configdoc/generate.php index 93328356..14335e98 100644 --- a/configdoc/generate.php +++ b/configdoc/generate.php @@ -99,6 +99,8 @@ foreach($schema->info as $namespace_name => $namespace_info) { foreach ($namespace_info as $name => $info) { + if ($info->class == 'alias') continue; + $dom_directive = $dom_document->createElement('directive'); $dom_namespace->appendChild($dom_directive); diff --git a/docs/dev-progress.html b/docs/dev-progress.html index 78156e6e..be35a9b6 100644 --- a/docs/dev-progress.html +++ b/docs/dev-progress.html @@ -60,7 +60,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}Standard - background-color COMPOSITE(<color>, transparent) + background SHORTHAND background SHORTHAND, currently alias for background-color border SHORTHAND, MULTIPLE border-color MULTIPLE @@ -145,13 +145,13 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;} border-style MULTIPLE background-image Dangerous, target milestone 1.3 - background-attachment ENUM(scroll, fixed), Depends on background-image + background-position Depends on background-image background-position Depends on 
background-image cursor Dangerous but fluffy - display ENUM(...), Dangerous but interesting; will not implement list-item, run-in (Opera only) or table (no IE); inline-block has incomplete IE6 support and requires -moz-inline-box for Mozilla. Unknown target milestone. + height Interesting, why use it? Unknown target milestone. height Interesting, why use it? Unknown target milestone. list-style-image Dangerous? max-height No IE 5/6 @@ -231,7 +231,7 @@ Mozilla on inside and needs -moz-outline, no IE support. min-height - CSS + style All Not all properties may be implemented, parser is good though. @@ -266,13 +266,13 @@ Mozilla on inside and needs -moz-outline, no IE support. style All Parser is reasonably functional. Status here doesn't count individual properties. align CAPTION Near-equiv style 'caption-side', drop left and right IMG Margin-left and margin-right = auto or parent div - TABLE + HR Equivalent style 'text-align' (IE tested) HR Near-equivalent style 'text-align' (Works for IE and Opera, but not Firefox). Also try margin-right:auto; margin-left:0;
for left ormargin-right:0; margin-left:auto;
for right (optionally replacing 0 with the original margin for that side)H1, H2, H3, H4, H5, H6, P Equivalent style 'text-align' - alt IMG Required, insert image filename if src is present or default invalid image text - bgcolor TABLE Equivalent style 'background-color' (IE tested) + TR Equivalent style 'background-color' (IE tested) + bgcolor TABLE Equivalent style 'background-color' TR Equivalent style 'background-color' - TD, TH Equivalent style 'background-color' + border IMG Equivalent style 'border-width', only applies when link present border IMG Near equivalent style 'border-width', as it only applies when link present clear BR Near-equiv style 'clear', transform 'all' into 'both' compact DL, OL, UL Boolean, needs custom CSS class; rarely used anyway diff --git a/docs/enduser-security.txt b/docs/enduser-security.txt index 695853d5..e7c9a8ce 100644 --- a/docs/enduser-security.txt +++ b/docs/enduser-security.txt @@ -7,6 +7,7 @@ and it's up to you to provide it the proper information and proper context to be effective. Things to remember: 1. Character Encoding: UTF-8. + This segment will soon be obsoleted by enduser-utf8.html Currently, the parser runs under the assumption that it is dealing with UTF-8. Not ISO-8859-1 or Windows-1252, UTF-8. And definitely not "no character encoding explicitly stated" or UTF-7. If you're not using UTF-8 as @@ -27,6 +28,7 @@ this may be configurable in the future. Do you want standards compliance? The doctype is a good place to start. 3. IDs + This segment is obsoleted by enduser-id.html They need to be unique, but without some knowledge of the rest of the document, it's difficult to know what's unique. %Attr.IDBlacklist needs to be set: we may want to consider disallowing IDs by default to diff --git a/docs/enduser-youtube.html b/docs/enduser-youtube.html index c70d7b44..0cfd3587 100644 --- a/docs/enduser-youtube.html +++ b/docs/enduser-youtube.html @@ -172,9 +172,10 @@ introduced after it has finished. 
dir BDO Required, insert ltr (or configuration value) if none Future plans
-It would probably be a good idea if this code was added to the core
-library. Look out for the inclusion of this into the core as a decorator
-or the like.
+This functionality is part of the core library, using the
+HTMLPurifier_Filter class to achieve the desired effect. Our implementation
+is slightly different, and this page will be updated to reflect that
+once 1.4.0 is released.