0
0
mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-09-20 11:15:18 +00:00
Commit Graph

59 Commits

Author SHA1 Message Date
Edward Z. Yang
1bed8b6d5f Added %Core.RemoveProcessingInstructions.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-06-20 18:26:44 -07:00
Edward Z. Yang
33afd7d9e0 Fix improper handling of IE conditional comments.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-06-18 06:08:54 -07:00
Edward Z. Yang
1a70bffd5a Emit errors when body is extracted.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-04 13:41:09 -04:00
Edward Z. Yang
ba9fd175d7 Make extractBody not terminate prematurely on first </body>.
Previously, if two </body> tags were present, HTML Purifier
would truncate everything after the first </body>.  This is
not ideal behavior; so HTML Purifier has been changed to
match up to the last </body>.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-07 22:19:04 -04:00
Edward Z. Yang
86ca784da3 Convert all to new configuration get/set format.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-21 03:00:34 -05:00
Edward Z. Yang
12b811d749 Add vim modelines to all files.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 04:24:59 -05:00
Edward Z. Yang
2c955af135 Remove trailing whitespace.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 02:28:20 -05:00
Edward Z. Yang
d304c5c976 Detect if domxml extension is loaded, and use DirectLex accordingly.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-10-08 17:06:10 -04:00
Edward Z. Yang
ed7983b559 Refactor lexer instantiation logic with exceptions and forced line tracking.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-09-05 14:04:23 -04:00
Edward Z. Yang
0423985b45 Detect if HTML support in DOM is disabled by checking loadHTML().
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-08-07 18:44:21 -04:00
Edward Z. Yang
1d90bb2397 Allow <![CDATA[<body>...</body>]]> not to trigger Core.ConvertDocumentToFragment
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-08-01 19:06:28 -04:00
Edward Z. Yang
eaabccdd9b [3.1.0] More PHP4->PHP5 conversions, notably reference removal of most methods that return objects
- Removed HTMLPurifier_Error
- Documentation updates
- Removed more copy() methods in favor of clone
- HTMLPurifier::getInstance() to HTMLPurifier::instance()
- Fix InterchangeBuilder to use HTMLPURIFIER_PREFIX

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1689 48356398-32a2-884e-a903-53898d9a118a
2008-04-23 02:40:17 +00:00
Edward Z. Yang
6c9c8f2380 [3.1.0] [BACKPORT] Fix bug with comments in styles, and some associated issues
- Restore printTokens()

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1570 48356398-32a2-884e-a903-53898d9a118a
2008-02-20 00:15:44 +00:00
Edward Z. Yang
35f8b3c801 Transition is complete! Cleanup and class rearrangement now necessary.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1539 48356398-32a2-884e-a903-53898d9a118a
2008-02-10 20:34:39 +00:00
Edward Z. Yang
522c8ed7c2 [3.1.0] The bulk of autoload support added
- Add FSTools:globr()
- require_once removed from all files
- HTMLPurifier.autoload.php added to register autoload handler
- Removed redundant chdir in maintenance script
- Modified standalone to use HTMLPurifier.includes.php for including stuff
- Added maintenance script remove-require-once.php which we used once and should never use again

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1516 48356398-32a2-884e-a903-53898d9a118a
2008-01-27 01:54:41 +00:00
Edward Z. Yang
a7fab00cdd [3.0.0] Convert all $context calls away from references
- Update TODO list
- URISchemeRegistry doesn't return a reference for instance anymore, should do the same for other singletons

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1477 48356398-32a2-884e-a903-53898d9a118a
2008-01-05 00:10:43 +00:00
Edward Z. Yang
5b3431d889 [3.0.0] Fully implement CSS extraction and cleaning. See NEWS for more information, it is now a Filter.
- Some Lexer things were moved around

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1469 48356398-32a2-884e-a903-53898d9a118a
2007-12-12 21:46:30 +00:00
Edward Z. Yang
831f552ec5 [3.0.0] <style> tags can now be extracted from input HTML using %HTML.ExtractStyleBlocks. These contents can be retrieved from $context->get('StyleBlocks');
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1468 48356398-32a2-884e-a903-53898d9a118a
2007-12-12 03:29:12 +00:00
Edward Z. Yang
3ef9bdf8a2 __construct'ify all main library classes.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1459 48356398-32a2-884e-a903-53898d9a118a
2007-11-29 04:29:51 +00:00
Edward Z. Yang
43f01925cd Convert to PHP 5 only codebase, adding visibility modifiers to all members and methods in the main library area (function only for test methods)
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1458 48356398-32a2-884e-a903-53898d9a118a
2007-11-25 02:24:39 +00:00
Edward Z. Yang
ccca8cc34f [2.1.3] Rename configuration directive
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1416 48356398-32a2-884e-a903-53898d9a118a
2007-09-09 01:35:50 +00:00
Edward Z. Yang
cb92a57e4e [2.1.2] Implement experimental HTML5 parsing using PH5P
- Fix debugger so that tokens can be printed without an index
- Fix some broken PEAR unit tests

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1383 48356398-32a2-884e-a903-53898d9a118a
2007-08-19 18:49:35 +00:00
Edward Z. Yang
710820cbe9 [2.1.0] Repair minor PHP4 regression due to undefined configuration directive
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1338 48356398-32a2-884e-a903-53898d9a118a
2007-08-02 01:48:43 +00:00
Edward Z. Yang
e99520ab96 Remove trailing ?> in PHP library files, add trailing newlines to all other files.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1253 48356398-32a2-884e-a903-53898d9a118a
2007-06-27 13:58:32 +00:00
Edward Z. Yang
275932ec05 [2.0.1] Fix DirectLex's incomprehension of un-armored script contents as CDATA using custom preg_replace_callback
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1244 48356398-32a2-884e-a903-53898d9a118a
2007-06-26 16:08:42 +00:00
Edward Z. Yang
7a8edc88f9 [2.0.1] Implement error collection for RemoveForeignElements.
- Register Generator context variable.
- Implement special substitutions for error collector.
- Also sort by order the errors came in.
- Fix line number determination bug in Lexer::create().
- Remove vestigial variables.
- Force all tag transforms to use copy(), implement serialize, unserialize algorithm for copy() in tokens.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1238 48356398-32a2-884e-a903-53898d9a118a
2007-06-26 02:49:21 +00:00
Edward Z. Yang
98b4e70a93 [2.0.1] Rewire line numbering so that if it's null it's autodetected based on error collection. also, update TODO.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1237 48356398-32a2-884e-a903-53898d9a118a
2007-06-25 23:22:35 +00:00
Edward Z. Yang
6f5592ae60 [2.0.1] Normalize newlines to \n for internal processing.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1235 48356398-32a2-884e-a903-53898d9a118a
2007-06-25 19:18:55 +00:00
Edward Z. Yang
bf0d659c47 [2.0.1] Improve special case handling for <script>
- DirectLex now honors comments with greater than or less than signs in them
- Comments are transformed into script elements, ending comments are scrapped
- Buggy generator code rewritten to be more error-proof
- AttrValidator checks if token has attributes before processing
- Remove invalid documentation from Scripting
- "Commenting" of script elements switched to the more advanced version

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1189 48356398-32a2-884e-a903-53898d9a118a
2007-06-21 14:44:26 +00:00
Edward Z. Yang
cf7a50163c Officially transition from 1.7 -> 2.0, mass substitution. Also, wrote WHATSNEW. We are in feature-freeze!
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1172 48356398-32a2-884e-a903-53898d9a118a
2007-06-20 03:00:36 +00:00
Edward Z. Yang
4bf15de536 [1.7.0] Implement line number counting in DirectLex, in preparation for error reporting
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1155 48356398-32a2-884e-a903-53898d9a118a
2007-06-18 02:01:01 +00:00
Edward Z. Yang
bf6ce67fc1 [1.7.0] Prototype-declarations for Lexer removed in favor of configuration determination of Lexer implementations.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1153 48356398-32a2-884e-a903-53898d9a118a
2007-06-17 21:27:39 +00:00
Edward Z. Yang
7a3e06d4d0 [1.7.0] Lexer is now pre-emptively included, with a conditional include for the PHP5 only version.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1090 48356398-32a2-884e-a903-53898d9a118a
2007-05-24 20:36:50 +00:00
Edward Z. Yang
aff4957531 [1.4.x?] Alright, have both PHP5 and DOMDocument requirements for DOMLex checked.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@766 48356398-32a2-884e-a903-53898d9a118a
2007-02-27 23:54:29 +00:00
Edward Z. Yang
2e16c4a968 Replaced version check with functionality check for DOM
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@667 48356398-32a2-884e-a903-53898d9a118a
2007-01-20 15:07:48 +00:00
Edward Z. Yang
ad1169c711 [1.4.0] Make all functions in Encoder static. Affects branches/strict
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@656 48356398-32a2-884e-a903-53898d9a118a
2007-01-18 22:55:44 +00:00
Edward Z. Yang
61f852d429 Merge in PHP5 strict changes that are applicable to PHP4.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@650 48356398-32a2-884e-a903-53898d9a118a
2007-01-16 22:22:08 +00:00
Edward Z. Yang
8f515b9cda [1.2.0]
- Partially finished migrating to new Context object (done in r485).
- Created HTMLPurifier_Harness to assist with testing, ChildDefTest migrated to that framework.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@484 48356398-32a2-884e-a903-53898d9a118a
2006-10-01 20:47:07 +00:00
Edward Z. Yang
37def0104b [1.1.2]
- Documentation updated
- API docs now exclude more files that are not classes
- Fixed lack of attribute parsing in HTMLPurifier_Lexer_PEARSax3
- (internal) Refactored parseData() to general Lexer class

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@466 48356398-32a2-884e-a903-53898d9a118a
2006-09-27 02:09:54 +00:00
Edward Z. Yang
3b30c2ca5b Renamed ConfigDef to ConfigSchema. (Required major internal restructuring but should not affect end-users)
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@424 48356398-32a2-884e-a903-53898d9a118a
2006-09-16 22:36:58 +00:00
Edward Z. Yang
aa0838492e Remove an outdated piece of information from Lexer's configuration documentation.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@399 48356398-32a2-884e-a903-53898d9a118a
2006-09-10 19:56:49 +00:00
Edward Z. Yang
0ac97774d4 More refactoring: bundling charset and entity stuff together makes little sense, so new HTMLPurifier/EntityParser.php.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@341 48356398-32a2-884e-a903-53898d9a118a
2006-08-30 02:21:39 +00:00
Edward Z. Yang
89376a11e3 Remove a huge swath of duplicated function calls by factoring them into a normalize() function. Also made DirectLex's variable names consistent with the rest of the classes.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@340 48356398-32a2-884e-a903-53898d9a118a
2006-08-29 20:05:26 +00:00
Edward Z. Yang
1de3088276 Refactor encoding and entity specific processing to HTMLPurifier_Encoder. We also need to refactor the escaping to this class too.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@339 48356398-32a2-884e-a903-53898d9a118a
2006-08-29 19:36:40 +00:00
Edward Z. Yang
7588068b7b Hacky full docuement parse thingy removed from DOMLex, fixes barfing on full HTML documents.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@328 48356398-32a2-884e-a903-53898d9a118a
2006-08-27 22:06:58 +00:00
Edward Z. Yang
24cde9c891 Revamp configuration files so that more rules can be added, internal organization is more logical, and descriptions are captured.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@327 48356398-32a2-884e-a903-53898d9a118a
2006-08-27 18:49:16 +00:00
Edward Z. Yang
973cc43b64 Malformed UTF-8 and non-SGML character detection and cleaning implemented
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@303 48356398-32a2-884e-a903-53898d9a118a
2006-08-19 17:53:59 +00:00
Edward Z. Yang
2eef708557 Fix syntax error.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@301 48356398-32a2-884e-a903-53898d9a118a
2006-08-19 00:26:33 +00:00
Edward Z. Yang
42ba96e2de Put in cleanUTF8 function, currently unused but will be adapted for our needs.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@300 48356398-32a2-884e-a903-53898d9a118a
2006-08-18 20:06:40 +00:00
Edward Z. Yang
a33cd12f1a Fixed broken multibyte numeric entity conversion in Lexer::substituteNonSpecialEntities()
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@299 48356398-32a2-884e-a903-53898d9a118a
2006-08-18 17:49:33 +00:00