Edward Z. Yang
54477c172b
Fix infinite loop in Lexer.
...
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-10-27 21:41:08 -07:00
Marcus Bointon
fac747bdbd
PSR-2 reformatting PHPDoc corrections
...
With minor corrections.
Signed-off-by: Marcus Bointon <marcus@synchromedia.co.uk>
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-08-17 22:27:26 -04:00
Edward Z. Yang
00c66fa9cb
Fix bug in parsing single attribute with entities.
...
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 19:44:18 -07:00
Edward Z. Yang
86ca784da3
Convert all to new configuration get/set format.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-21 03:00:34 -05:00
Edward Z. Yang
12b811d749
Add vim modelines to all files.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 04:24:59 -05:00
Edward Z. Yang
2c955af135
Remove trailing whitespace.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 02:28:20 -05:00
Edward Z. Yang
6a06b92f0c
Setup ErrorCollector to maintain new error format, and output that HTML.
...
Also changed:
- DirectLex keeps track of column numbers in context
- New class HTMLPurifier_ErrorStruct
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-09-15 19:08:58 -04:00
Edward Z. Yang
ed7983b559
Refactor lexer instantiation logic with exceptions and forced line tracking.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-09-05 14:04:23 -04:00
Edward Z. Yang
c6914dce51
Track column numbers in addition to line numbers.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-09-01 14:10:10 -04:00
Edward Z. Yang
aa0fdeee30
Refine Lexers for parsing stray angled brackets; %Core.AggressivelyFixLt = true
...
By default, the DirectLex and DOMLex behavior with stray angled brackets
varied a great deal due to their implementations. A little known directive
%Core.AggressivelyFixLt attempted to match DOMLex's behavior with DirectLex's,
but it was off by default. By turning it on by default, users now enjoy these
benefits, and performance-minded users can turn it back off.
Also, several refinements to stray angled bracket parsing was made. Specifically:
* DirectLex: Handle each left angled bracket individually, which prevents
strange behavior as reported by eon.
* DOMLex: Iterate aggressive lt fix, so that stacked brackets like << are
handled.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-07-07 08:52:29 -04:00
Edward Z. Yang
9f1e678b48
[3.1.0] Fixed fatal error in PH5P lexer with invalid tag names
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1650 48356398-32a2-884e-a903-53898d9a118a
2008-04-05 04:28:37 +00:00
Edward Z. Yang
cb793cd9b9
- Restore substr_count compatibility method; it's not just PHP 4
...
- Update missing includes
- Fix generate-standalone.php fatal error
- Make LexerTest resilient against variant versions of libxml
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1573 48356398-32a2-884e-a903-53898d9a118a
2008-02-20 01:28:19 +00:00
Edward Z. Yang
6c9c8f2380
[3.1.0] [BACKPORT] Fix bug with comments in styles, and some associated issues
...
- Restore printTokens()
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1570 48356398-32a2-884e-a903-53898d9a118a
2008-02-20 00:15:44 +00:00
Edward Z. Yang
35f8b3c801
Transition is complete! Cleanup and class rearrangement now necessary.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1539 48356398-32a2-884e-a903-53898d9a118a
2008-02-10 20:34:39 +00:00
Edward Z. Yang
522c8ed7c2
[3.1.0] The bulk of autoload support added
...
- Add FSTools:globr()
- require_once removed from all files
- HTMLPurifier.autoload.php added to register autoload handler
- Removed redundant chdir in maintenance script
- Modified standalone to use HTMLPurifier.includes.php for including stuff
- Added maintenance script remove-require-once.php which we used once and should never use again
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1516 48356398-32a2-884e-a903-53898d9a118a
2008-01-27 01:54:41 +00:00
Edward Z. Yang
a7fab00cdd
[3.0.0] Convert all $context calls away from references
...
- Update TODO list
- URISchemeRegistry doesn't return a reference for instance anymore, should do the same for other singletons
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1477 48356398-32a2-884e-a903-53898d9a118a
2008-01-05 00:10:43 +00:00
Edward Z. Yang
43f01925cd
Convert to PHP 5 only codebase, adding visibility modifiers to all members and methods in the main library area (function only for test methods)
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1458 48356398-32a2-884e-a903-53898d9a118a
2007-11-25 02:24:39 +00:00
Edward Z. Yang
1274cfed49
[2.1.3] Fix possible error in DirectLex reported by Nate Abele
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1438 48356398-32a2-884e-a903-53898d9a118a
2007-11-05 03:22:22 +00:00
Edward Z. Yang
24a4dfdf83
[2.1.2?] Fix invisible DirectLex parsing error with empty elements that have attributes containing slashes
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1375 48356398-32a2-884e-a903-53898d9a118a
2007-08-08 05:05:30 +00:00
Edward Z. Yang
e7e81c0a5b
[2.1.0] Fix some minor DirectLex bugs that may lead to PHP errors
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1310 48356398-32a2-884e-a903-53898d9a118a
2007-07-05 21:29:07 +00:00
Edward Z. Yang
a6ede3804e
[2.1.0] True emoticon < fix.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1260 48356398-32a2-884e-a903-53898d9a118a
2007-06-27 16:40:18 +00:00
Edward Z. Yang
e99520ab96
Remove trailing ?> in PHP library files, add trailing newlines to all other files.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1253 48356398-32a2-884e-a903-53898d9a118a
2007-06-27 13:58:32 +00:00
Edward Z. Yang
275932ec05
[2.0.1] Fix DirectLex's incomprehension of un-armored script contents as CDATA using custom preg_replace_callback
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1244 48356398-32a2-884e-a903-53898d9a118a
2007-06-26 16:08:42 +00:00
Edward Z. Yang
7a8edc88f9
[2.0.1] Implement error collection for RemoveForeignElements.
...
- Register Generator context variable.
- Implement special substitutions for error collector.
- Also sort by order the errors came in.
- Fix line number determination bug in Lexer::create().
- Remove vestigial variables.
- Force all tag transforms to use copy(), implement serialize, unserialize algorithm for copy() in tokens.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1238 48356398-32a2-884e-a903-53898d9a118a
2007-06-26 02:49:21 +00:00
Edward Z. Yang
98b4e70a93
[2.0.1] Rewire line numbering so that if it's null it's autodetected based on error collection. also, update TODO.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1237 48356398-32a2-884e-a903-53898d9a118a
2007-06-25 23:22:35 +00:00
Edward Z. Yang
6f5592ae60
[2.0.1] Normalize newlines to \n for internal processing.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1235 48356398-32a2-884e-a903-53898d9a118a
2007-06-25 19:18:55 +00:00
Edward Z. Yang
0e9904a9ba
Factor out DirectLex error testing to its own class.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1230 48356398-32a2-884e-a903-53898d9a118a
2007-06-25 01:56:00 +00:00
Edward Z. Yang
728088f2ba
[2.0.1] Rather than pass line number by parameter, have it be retrieved via Context. Add $ignore_error boolean to get().
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1228 48356398-32a2-884e-a903-53898d9a118a
2007-06-25 01:08:57 +00:00
Edward Z. Yang
8ae2604440
[2.0.1] Start making more moves towards full error reporting. Revise message naming conventions. Fix variable assignment for error collecting. Revise Language interface to be as readable as possible (NOT compact). Add error reporting to DirectLex. Rewrite ErrorCollector.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1227 48356398-32a2-884e-a903-53898d9a118a
2007-06-25 00:48:26 +00:00
Edward Z. Yang
dda4038446
[2.0.1] Reorder definition cache includes
...
- Update some comments, rename some variables
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1196 48356398-32a2-884e-a903-53898d9a118a
2007-06-21 23:56:19 +00:00
Edward Z. Yang
bf0d659c47
[2.0.1] Improve special case handling for <script>
...
- DirectLex now honors comments with greater than or less than signs in them
- Comments are transformed into script elements, ending comments are scrapped
- Buggy generator code rewritten to be more error-proof
- AttrValidator checks if token has attributes before processing
- Remove invalid documentation from Scripting
- "Commenting" of script elements switched to the more advanced version
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1189 48356398-32a2-884e-a903-53898d9a118a
2007-06-21 14:44:26 +00:00
Edward Z. Yang
cf7a50163c
Officially transition from 1.7 -> 2.0, mass substitution. Also, wrote WHATSNEW. We are in feature-freeze!
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1172 48356398-32a2-884e-a903-53898d9a118a
2007-06-20 03:00:36 +00:00
Edward Z. Yang
c3094275ef
Fix PHP4 compatibility problems with substr_count
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1163 48356398-32a2-884e-a903-53898d9a118a
2007-06-19 01:20:00 +00:00
Edward Z. Yang
4bf15de536
[1.7.0] Implement line number counting in DirectLex, in preparation for error reporting
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1155 48356398-32a2-884e-a903-53898d9a118a
2007-06-18 02:01:01 +00:00
Edward Z. Yang
ac3ab2a556
[1.6.1] DirectLex now preserves text in which a < bracket is followed by a non-alphanumeric character. This means that certain emoticons are now preserved.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@939 48356398-32a2-884e-a903-53898d9a118a
2007-04-04 02:22:27 +00:00
Edward Z. Yang
d886ed59fd
[1.3.1] Standardized all attribute handling variables to attr, made it plural
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@600 48356398-32a2-884e-a903-53898d9a118a
2006-12-06 22:29:08 +00:00
Edward Z. Yang
8f515b9cda
[1.2.0]
...
- Partially finished migrating to new Context object (done in r485).
- Created HTMLPurifier_Harness to assist with testing, ChildDefTest migrated to that framework.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@484 48356398-32a2-884e-a903-53898d9a118a
2006-10-01 20:47:07 +00:00
Edward Z. Yang
37def0104b
[1.1.2]
...
- Documentation updated
- API docs now exclude more files that are not classes
- Fixed lack of attribute parsing in HTMLPurifier_Lexer_PEARSax3
- (internal) Refactored parseData() to general Lexer class
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@466 48356398-32a2-884e-a903-53898d9a118a
2006-09-27 02:09:54 +00:00
Edward Z. Yang
0ac97774d4
More refactoring: bundling charset and entity stuff together makes little sense, so new HTMLPurifier/EntityParser.php.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@341 48356398-32a2-884e-a903-53898d9a118a
2006-08-30 02:21:39 +00:00
Edward Z. Yang
89376a11e3
Remove a huge swath of duplicated function calls by factoring them into a normalize() function. Also made DirectLex's variable names consistent with the rest of the classes.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@340 48356398-32a2-884e-a903-53898d9a118a
2006-08-29 20:05:26 +00:00
Edward Z. Yang
1de3088276
Refactor encoding and entity specific processing to HTMLPurifier_Encoder. We also need to refactor the escaping to this class too.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@339 48356398-32a2-884e-a903-53898d9a118a
2006-08-29 19:36:40 +00:00
Edward Z. Yang
973cc43b64
Malformed UTF-8 and non-SGML character detection and cleaning implemented
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@303 48356398-32a2-884e-a903-53898d9a118a
2006-08-19 17:53:59 +00:00
Edward Z. Yang
9a35dfa6b9
Add support for full document parsing, aka discard everything that's not in-between body if applicable.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@258 48356398-32a2-884e-a903-53898d9a118a
2006-08-15 00:53:24 +00:00
Edward Z. Yang
d7140f2e05
Outfit a bunch of other classes so they can accept a configuration object. Put in basic scaffolding for extractBody() functionality.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@257 48356398-32a2-884e-a903-53898d9a118a
2006-08-15 00:31:12 +00:00
Edward Z. Yang
609977f9f5
Add CDATA support to the Lexers, as well as give PEARSax3 entity replacement.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@106 48356398-32a2-884e-a903-53898d9a118a
2006-07-23 23:04:34 +00:00
Edward Z. Yang
5ce0ae7056
Implement EntityLookup and put in the Lexer. Some behavior was migrated, since it looks like it will have to be used in all Lexers, not just DirectLex (which is the only one that uses it).
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@105 48356398-32a2-884e-a903-53898d9a118a
2006-07-23 21:07:30 +00:00
Edward Z. Yang
bcc2b09ac7
Finish documenting PEARSax3, touch up the other docs. Nuke the original lexer.txt document.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@102 48356398-32a2-884e-a903-53898d9a118a
2006-07-23 18:56:00 +00:00
Edward Z. Yang
2fa1161d3d
- Implemented special entity conversion.
...
- Optimized and documented DirectLex.
- Rearranged test cases.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@100 48356398-32a2-884e-a903-53898d9a118a
2006-07-23 18:13:04 +00:00
Edward Z. Yang
14f481bcf6
svn:eol-style = native
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@97 48356398-32a2-884e-a903-53898d9a118a
2006-07-23 00:11:03 +00:00
Edward Z. Yang
ca1aefe271
Commit various optimizations to the Lexer, and add stub file for profiling the lexer.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@92 48356398-32a2-884e-a903-53898d9a118a
2006-07-22 22:48:07 +00:00