htmlpurifier

mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-12-22 08:21:52 +00:00

Author	SHA1	Message	Date
Edward Z. Yang	d2de8d976a	Add test for invalid SafeIframe usage. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-26 21:52:55 +08:00
Bradley M. Froehle	4164b2eb2b	Implement Iframe module, and provide %HTML.SafeIframe and %URI.SafeIframeRegexp for untrusted usage. The purpose of this addition is twofold. In trusted mode, iframes are now unconditionally allowed. However, many online video providers (YouTube, Vimeo) and other web applications (Google Maps, Google Calendar, etc) provide embed code in iframe format, which is useful functionality in untrusted mode. You can specify iframes as trusted elements with %HTML.SafeIframe; however, you need to additionally specify a whitelist mechanism such as %URI.SafeIframeRegexp to say what iframe embeds are OK (by default everything is rejected). Note: As iframes are invalid in strict doctypes, you will not be able to use them there. We also added an always_load parameter to URIFilters in order to support the strange nature of the SafeIframe URIFilter (it always needs to be loaded, due to the inability of accessing the %HTML.SafeIframe directive to see if it's needed!) We expect this URIFilter can expand in the future to offer more complex validation mechanisms. Signed-off-by: Bradley M. Froehle <brad.froehle@gmail.com> Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-26 21:50:53 +08:00
Edward Z. Yang	6b643ede02	Implement %HTML.AllowedComments and %HTML.AllowedCommentsRegexp Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-26 15:34:42 +08:00
Edward Z. Yang	e41af46a8b	Fix broken table content model, easily seen in XHTML1.1 Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-26 14:49:26 +08:00
Edward Z. Yang	3570c9985a	Properly handle nested sublists by folding into previous list item. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-26 14:00:34 +08:00
Edward Z. Yang	8d572993b4	Implement %HTML.TargetBlank Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-26 08:36:00 +08:00
Edward Z. Yang	9b10515fa4	Core.EscapeNonASCIICharacters now always works, even if target is UTF-8. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-25 23:31:15 +08:00
Edward Z. Yang	d45e11cc6b	Add one more test for SPL autoload defaults. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-25 02:58:51 -05:00
Edward Z. Yang	94c15d1f56	Fix iconv truncation bug. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-12-25 02:31:06 -05:00
Edward Z. Yang	820d6e9097	Do not duplicate nofollow attribute in transform. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-08-24 09:56:13 -04:00
Edward Z. Yang	bcfbb8338c	URI.Munge munges https to http URIs. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-04-10 13:09:24 +01:00
Edward Z. Yang	0124605918	Fix CSS URL innerHTML/cssText escaping bug. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-27 21:24:32 +01:00
Edward Z. Yang	afb007d22f	Protect against font family innerHTML/cssText attacks. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-27 20:35:43 +01:00
Edward Z. Yang	0dd9e4faf4	Fix Internet Explorer innerHTML bug. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-27 11:50:52 +01:00
Edward Z. Yang	94ed3b1231	Implement CSS.AllowedFonts. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-24 22:54:39 +00:00
Edward Z. Yang	6a6c0ed5d7	Don't autoclose if no parents support the tag. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-22 00:26:41 +00:00
Edward Z. Yang	e05b555448	Safety update for nested ul test. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-21 21:05:23 +00:00
Edward Z. Yang	ee9c70ab7f	Fix E_NOTICE from indexing into empty string. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-17 17:33:11 +00:00
Edward Z. Yang	e76f4b45d0	Dramatically rewrite null host URI handling. Basically, browsers don't parse what should be valid URIs correctly, so we have to go through some backbends to accommodate them. Specifically, for browseable URIs, the following URIs have unintended behavior: - ///example.com - http:/example.com - http:///example.com Furthermore, if the path begins with //, modifying these URLs must be done with care, as if you remove the host-name component, the parse tree changes. I've modified the engine to follow correct URI semantics as much as possible while outputting browser compatible code, and invalidate the URI in cases where we can't deal. There has been a refactoring of URIScheme so that this important check is always performed, introducing a new member variable allow_empty_host which is true on data, file, mailto and news schemes. This also fixes bypass bugs on URI.Munge. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-01-25 18:56:46 +00:00
Edward Z. Yang	a32d5b52e1	Fix embedding flash on non-IE browsers and allow more wmode. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-01-22 12:28:57 +00:00
Petr Skoda	78c4e62245	Add new Cache.SerializerPermissions option.	2011-01-13 22:57:40 +00:00
Edward Z. Yang	f3d050c517	Fix two bugs with caching of customized raw definitions. The first bug is that we will repeatedly write out the result of a customized raw definition to the filesystem, even when a cache entry already exists. The second bug is that caching these definitions doesn't actually work (the cache entry is written but never used.) A new API for retrieving raw definitions permits the user to take advantage of caching. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-12-30 23:51:53 +00:00
Edward Z. Yang	cfc4ee1faf	Add initial implementation of CSS.Trusted. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-11-12 18:45:03 +00:00
Edward Z. Yang	598c5b60c9	Add sanity check against ze1_compatibility_mode. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-11-12 16:15:03 +00:00
Edward Z. Yang	c9e7ffc172	Fix incorrect PEARSax3 test assertion. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-11-12 16:06:34 +00:00
Edward Z. Yang	4754d407aa	Fix removal of id with DirectLex by preserving armor. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-10-28 17:25:31 +01:00
Nick Pope	0b9db1f54b	Allow non-static autoload methods w/ PHP >= 5.2.11 HTML Purifier loads itself as the first autoload function by unregistering all existing functions and re-registering them after registering itself. Originally an exception was thrown when a non-static object method was encountered as the behaviour of spl_autoload_functions() did not return the object instance, but only the class name. This was filed on PHP bugs (#44144). The bug was fixed for PHP >= 5.2.11 and >= 5.3 Signed-off-by: Nick Pope <nick@nickpope.me.uk> Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-10-28 17:25:17 +01:00
Edward Z. Yang	8c80349f9d	Implement HTML.Nofollow for external links. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-28 12:01:57 -04:00
Edward Z. Yang	d848c99b74	Make IE conditional comment matching ungreedy. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-28 10:22:38 -04:00
Edward Z. Yang	86990a21f1	Rename newline normalization directive to something better. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-15 02:50:39 -04:00
Tomasz Muras	9573f0933d	Make newline normalization optional.	2010-09-14 23:49:28 -04:00
Edward Z. Yang	ec86598446	Add support for file:// URI scheme. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-09 00:01:26 -04:00
Edward Z. Yang	7c91104532	Implement HTML.FlashAllowFullScreen. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-08 23:39:20 -04:00
Edward Z. Yang	eac628f490	Add %CSS.ForbiddenProperties directive. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-04 02:59:03 -04:00
Edward Z. Yang	479d793562	Reword documentation to be clearer, and give warning on common user error. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-04 01:31:20 -04:00
Edward Z. Yang	c04a441b3e	Actually make URI.DisableResources do something. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-06-30 05:59:17 -07:00
Edward Z. Yang	1bed8b6d5f	Added %Core.RemoveProcessingInstructions. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-06-20 18:26:44 -07:00
Edward Z. Yang	33afd7d9e0	Fix improper handling of IE conditional comments. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-06-18 06:08:54 -07:00
Edward Z. Yang	00c66fa9cb	Fix bug in parsing single attribute with entities. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-31 19:44:18 -07:00
Edward Z. Yang	d3abcb90e3	Rewrite CSS url() and font-family output logic. The new logic is as follows: * Given a URL to insert into url(), check that it is properly URL encoded (in particular, a doublequote and backslash never occurs within it) and then place it as url("http://example.com"). * Given a font name, if it is strictly alphanumeric, it is safe to omit quotes. Otherwise, wrap in double quotes and replace '"' with '\22 ' (note trailing space) and '\' with '\5C ' (ditto). We introduce expandCSSEscape() which is a hack for common parsing idioms in CSS; this means that CSS escapes are now recognized inside URLs as well as unquoted font names. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-31 18:45:21 -07:00
Edward Z. Yang	875b0febde	Fix infinite loop involving wrapping formedness. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-17 23:22:51 -04:00
Edward Z. Yang	3166b8a10f	Fix bug in background-position with center keyword. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-05 15:08:57 -04:00
Edward Z. Yang	1a70bffd5a	Emit errors when body is extracted. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-04 13:41:09 -04:00
Edward Z. Yang	c1cbd9e565	Mute STRICT errors from CSSTidy and don't run PEARSax3 on PHP 5.3. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-04-26 18:27:32 -04:00
Edward Z. Yang	da94d3d6ac	Always quote the contents of url() in CSS. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-04-26 12:10:15 -04:00
Edward Z. Yang	70a7a3f5dd	Handle <ol><ol> properly by adding missing <li> tag. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-10 00:58:37 -05:00
Edward Z. Yang	dc90e8e85b	Support flashvars. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-08 01:16:57 -05:00
Edward Z. Yang	97125ed18b	Implement data URI scheme. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-07 21:45:39 -05:00
Paul Stone	9a9036c689	Implement auto-formatter that removes empty span tags. Signed-off-by: Paul Stone <patches@pdjs.co.uk> Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-07 18:59:33 -05:00
Edward Z. Yang	ac18672aba	Fix extant broken PEARSax3 parsing patterns. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-02-26 21:14:52 -05:00
Edward Z. Yang	faf28682ad	Manually work around PEARSax3 E_STRICT errors. Previously, my development environment was not running the PEARSax3 tests because my environment was set to E_STRICT error handling, and thus the tests were skipped. Relax this requirement by making the wrapper class E_STRICT safe. This introduces a few failing tests. Also update TODO and add another fresh test. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-02-26 20:42:42 -05:00
Edward Z. Yang	694583259c	Fix autoparagraph bug with non-inline elements. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-02-15 02:55:33 -05:00
Edward Z. Yang	ba9fd175d7	Make extractBody not terminate prematurely on first </body>. Previously, if two </body> tags were present, HTML Purifier would truncate everything after the first </body>. This is not ideal behavior; so HTML Purifier has been changed to match up to the last </body>. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-07-07 22:19:04 -04:00
Edward Z. Yang	4d27906b02	Make %URI.Munge respect %URI.Host (don't munge). %URI.Munge incorrectly munged URIs that pointed to the same host as the current website (it did, however, have the correct behavior for when the munge URL was on the same server). Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-07-06 22:04:51 -04:00
Edward Z. Yang	c7594487a2	Fix inability to totally override content model. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-06-10 18:24:52 -04:00
Edward Z. Yang	733a5ce5c3	Fix allowsElement() bug manifesting in LinkifyTest. Thanks frank farmer for reporting. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-06-10 18:11:34 -04:00
Edward Z. Yang	6e66dc9cad	Add HTMLPurifier_config->serialize() Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-05-30 00:25:14 -04:00
Edward Z. Yang	84abae08f5	Relax allowed values of class for certain doctypes, see %Attr.ClassUseCDATA Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-05-26 01:07:40 -04:00
Edward Z. Yang	baf053b016	Implement %Attr.AllowedClasses and %Attr.ForbiddenClasses. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-05-25 22:08:45 -04:00
Edward Z. Yang	bfbe29d5a1	Rename ExtractStyleBlocks configuration parameters. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-05-25 21:54:39 -04:00
Edward Z. Yang	e194b8efc6	Rename AutoFormatParam.PurifierLinkifyDocURL. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-05-25 21:51:08 -04:00
Edward Z. Yang	41c9226f3d	Style refresh: add/remove vimlines, fix minor factual errors. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-04-09 12:47:10 -04:00
Edward Z. Yang	e3c2063f69	Implement %AutoFormat.RemoveEmpty.RemoveNbsp, by popular demand. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-04-09 00:53:19 -04:00
Edward Z. Yang	398a02039e	Implement %HTML.Attr.Name.UseCDATA which relaxes name validation rules. Sponsored-by: Ian Cook <thinkspill@gmail.com> Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-03-20 19:34:38 -04:00
Edward Z. Yang	84e2e141fc	Fix bad configuration call in NameSyncTest.php. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-03-14 19:18:02 -04:00
Edward Z. Yang	eaa906f8fc	Implement configuration inheritance. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-02-21 03:01:02 -05:00
Edward Z. Yang	86ca784da3	Convert all to new configuration get/set format. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-02-21 03:00:34 -05:00
Edward Z. Yang	b107eec452	Revamp configuration backend. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-02-21 03:00:33 -05:00
Edward Z. Yang	fcbf724e6e	Make name="" and id="" play nicely together. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-02-21 02:58:30 -05:00
Edward Z. Yang	e802065b65	Punt Lexer test entirely for 5.0.5. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-02-16 17:18:30 -05:00
Edward Z. Yang	07ed1bbf8c	Fix broken trusted comments functionality. This fix is slightly hackish, as we simply treat comments as whitespace. This should largely be correct, and breaks no current test cases, although it could result in noncompliant behavior. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-02-05 18:04:10 -05:00
Edward Z. Yang	b9094d5ec8	Convert HTMLPurifier_Config to use property list backend. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2009-02-02 18:42:23 -05:00
Edward Z. Yang	bfe474042f	Implement "carryover" functionality, requested by Kinderlehrer <bitweaver@7doves.com> This commit is a limited implementation of the "active formatting elements" algorithm implemented in HTML5, which preserves certain formatting elements such as <a> and <b> when exiting or entering nodes. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-12-20 13:06:00 -05:00
Edward Z. Yang	119ebcda71	Implement user-friendly links to test-cases on web tester. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-12-20 13:01:20 -05:00
Edward Z. Yang	33a873f5cb	Fix missing numbers when pass/fail count is zero. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-12-06 16:08:09 -05:00
Edward Z. Yang	12b811d749	Add vim modelines to all files. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-12-06 04:24:59 -05:00
Edward Z. Yang	2c955af135	Remove trailing whitespace. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-12-06 02:28:20 -05:00
Edward Z. Yang	3a6b63dff1	Generic implementation of property-lists. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-12-06 00:43:42 -05:00
Edward Z. Yang	5cfecebb33	Fix bug involving whitespace-only nodes. Thanks Eric Wald for reporting. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-12-02 20:13:47 -05:00
Edward Z. Yang	f5cd2c07ea	Implement 'overflow' CSS property. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-11-27 16:14:50 -05:00
Edward Z. Yang	6691676666	Fix newline issues in tests. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-11-26 15:30:59 -05:00
Edward Z. Yang	527f154d3d	Add verbose mode to command line test runner. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-11-23 20:45:21 -05:00
Edward Z. Yang	778ddf7c96	Turn on unit tests for UnitConverter. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-11-23 20:43:58 -05:00
Edward Z. Yang	3a2fd0b5db	Improve floating point scaling in UnitConverter. When precision dictates that a number be zero padded, we cannot give sprintf() a negative precision specifier. This commit implements manual negative precision printing of floats, taking into account common rounding errors with floating point numbers. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-10-24 12:50:59 -04:00
David Morton	0b6ae1c3c1	Custom Injector to display URL address along with link text. When viewing potentially hostile html, it may be helpful to see what a given link was pointing to. This new injector takes the href attribute and adds the text after the link, and deletes the href attribute. Other forms of display could easily be contrived, but this seems to be a good basic way to present the information. Signed-off-by: David Morton <mortonda@dgrmm.net> Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-10-23 17:11:29 -04:00
Edward Z. Yang	ab263a0bf1	Rewrite spurious encoding test, as utf8 is sometimes useful. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-10-23 15:22:31 -04:00
Edward Z. Yang	d304c5c976	Detect if domxml extension is loaded, and use DirectLex accordingly. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-10-08 17:06:10 -04:00
Edward Z. Yang	f7bc0b0875	Implement %Attr.DefaultImageAlt, allowing overriding default behavior for alt attributes. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-10-06 14:51:03 -04:00
Edward Z. Yang	70515dd48f	Increase test coverage, and modify handleEnd behavior to only see correct tokens. Previously, handleEnd was called for any end tag, except ones that were obviously spurious because there were no parent tags. Now, it is only called for end tags that are "approved." If an injector operates on the end tag, we automatically punt. There may be some optimizations that could be made to this procedure, but for now it's much more consistent. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-10-01 15:40:31 -04:00
Edward Z. Yang	cd4500457e	More refactoring to MakeWellFormed and Injectors; they work better than ever now! Major paradigm shift in this commit is bailing ship on the "skip" integers, which were extremely buggy and error prone, and simply mark tokens as processed or not processed by injectors. Other notable changes: - Removed ad hoc decrements to inputIndex in favor of $reprocess flag variable - Moved rewind outside of processToken() - Make rewind properly ignore all other injectors - Cleanup end of document code - Reconfigure injector loops to account for skips and rewinds - Punt the empty to start/end transformation - Completely rewrite processToken to be array based - Added skip and rewind member variables to tokens - Fixed a longstanding bug with remove empty! Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-10-01 03:14:28 -04:00
Edward Z. Yang	fa413e96ac	Implement Injector->handleEnd, with lots of refactoring for injector. Previous design of injector streaming involved editability only to start, empty and text tokens, because they could be safely modified without causing formedness errors. By modifying notifyEnd to operate before MakeWellFormed's safeguards kick into effect, it can be converted into a handle function, allowing for arbitrary modification of end tags. This change involved quite a bit of restructuring of the MakeWellFormed code, including the moving of end of document tags to inside the loop, so rewinding on those tags would be functional, increased reuse of the end tag codepath by code that inserts end tags (as they could be changed out from under you), and processToken modified to have an extra parameter to force re-processing of a token if the original token was an end token. We're not exactly sure if handleEnd works at this point, but the important talking point about this refactoring is that nothing else broke. Also, a number of convenience functions were moved from AutoParagraph to the Injector supertype (specifically: forward, forwardToEndToken, backward, and current). Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-10-01 00:54:51 -04:00
Edward Z. Yang	d0fdcc103e	Add support for proprietary "background" attribute in table elements. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-09-27 21:19:35 -04:00
Edward Z. Yang	6a06b92f0c	Setup ErrorCollector to maintain new error format, and output that HTML. Also changed: - DirectLex keeps track of column numbers in context - New class HTMLPurifier_ErrorStruct Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-09-15 19:08:58 -04:00
Edward Z. Yang	3184fee468	Undo start()/end() error collector changes in AttrValidator. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-09-05 17:25:35 -04:00
Edward Z. Yang	ed7983b559	Refactor lexer instantiation logic with exceptions and forced line tracking. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-09-05 14:04:23 -04:00
Edward Z. Yang	c6914dce51	Track column numbers in addition to line numbers. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-09-01 14:10:10 -04:00
Edward Z. Yang	d9e60350d3	Migrate AttrValidator to nested error format; modify generator logic in ErrorCollector. AttrValidator's changes are fairly self-explanatory, but ErrorCollector's changes are worth a little discussion. ErrorCollector can use generators at various points during its flow control; there are two distinct generators that it should use: 1. The one used for the output, and 2. The one used for the error output. These will usually be the same, but in the odd case where they need to be different, getHTMLFormatted() will accept an alternate configuration object with an appropriate doctype. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-18 22:13:58 -04:00
Edward Z. Yang	c807ed5fe2	Implement nested error collection with start() and end() in ErrorCollector. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-16 00:41:34 -04:00
Edward Z. Yang	c9b6f125aa	Forms implementation for %HTML.Trusted. Some backend changes: * Added Charsets and Character attribute types * Fix a heavily recursive form of ContentSets, this allows a content-set to include another content-set which includes another content-set, and so forth. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-15 18:57:44 -04:00
Edward Z. Yang	dc28346677	Fix bug where absolute paths with dots/double-dots were not collapsed. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-15 13:12:54 -04:00
Edward Z. Yang	8423daef05	Increase test coverage for MakeAbsolute. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-13 23:19:38 -04:00
Edward Z. Yang	617f70a8ac	Improve auto-paragraph to preserve newlines and handle edge-cases better. This is a very large commit that includes numerous improvements to the AutoParagraph injector. These are: * Rewritten flow control of the injector to use almost exclusively binary conditionals. * Improved inline documentation with "State" comments, which give concise examples of what the token stack looks like at flow points. * Documentation for all flow branches, even those with no actions. * Factoring out of common operations to improve readability, especially the new iterator private methods. * Expanded test-suite which covers new flow points, and corrects some errors in previous cases. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-10 00:32:29 -04:00
Edward Z. Yang	e013bc9126	Fix bug involving autoclose and inline elements in strict <blockquote>. The newest autoclose code uses the elements property in whether or not an element should be closed by a particular tag. The heuristic is simple; if the element doesn't allow that tag as a child, it closes the parent container. This doesn't work, however, with <blockquote>, which while not allowing inline styles under Strict doctypes, requires them to be passed through MakeWellFormed. The fix was to transition MakeWellFormed to call a method to retrieve the elements, and then have StrictBlockquote implement a special version of this method. Future versions of HTML Purifier may be more flexible in this regard--further study of the HTML5 specification is required. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-01 20:52:06 -04:00
Edward Z. Yang	1d90bb2397	Allow <![CDATA[<body>...</body>]]> not to trigger Core.ConvertDocumentToFragment Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-01 19:06:28 -04:00
Edward Z. Yang	85090520f1	Add double-munging protection by checking if the domains are the same. Previously, if an absolute munge URL location was used, HTML passed through HTML Purifier multiple times would be munged multiple times. This patch checks if the output URI has the same URI as the input URI; if they do, the munge is considered unnecessary and discarded. Requested-by: Chris <justbittin@gmail.com> Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-26 22:45:19 -06:00
Edward Z. Yang	3b6aa10592	%URI.DisableExternal(Resources) uses %URI.Base if %URI.Host is not available. As part of its duties, URIDefinition determines the base URL and the host URL of the page based on the two corresponding configuration directives. The DisableExternal URIFilter, however, bypassed this check by directly checking %URI.Host. This fix forwards the call through URIDefinition. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-10 18:46:46 -04:00
Edward Z. Yang	e05bd77344	Implement HTMLT tests, and migrate HTMLPurifierTest to this format. HTMLT tests are a compact and easy-to-use way of making assertPurification type tests. They take the format of: --INI-- Ns.Directive = "directive value" --HTML-- Input HTML --EXPECT-- Expected HTML Expect more features and migration to be coming soon. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-07 08:59:33 -04:00
Edward Z. Yang	a227cb483a	Allow empty sections in string hashes; previously they were left undefined. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-07 08:57:16 -04:00
Edward Z. Yang	aa0fdeee30	Refine Lexers for parsing stray angled brackets; %Core.AggressivelyFixLt = true By default, the DirectLex and DOMLex behavior with stray angled brackets varied a great deal due to their implementations. A little known directive %Core.AggressivelyFixLt attempted to match DOMLex's behavior with DirectLex's, but it was off by default. By turning it on by default, users now enjoy these benefits, and performance-minded users can turn it back off. Also, several refinements to stray angled bracket parsing were made. Specifically: * DirectLex: Handle each left angled bracket individually, which prevents strange behavior as reported by eon. * DOMLex: Iterate aggressive lt fix, so that stacked brackets like << are handled. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-07 08:52:29 -04:00
Edward Z. Yang	c845f0bb78	Give warnings when attempting to use encoding iconv doesn't support. Previously, attempting to set %Core.Encoding to an encoding iconv didn't know about would result in a silent failure, with the return of the boolean false. Now it will fatally error out. Reported-by: mcgrailm <mgm19@psu.edu> Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-05 03:14:32 -04:00
Edward Z. Yang	594268ca3b	Fix two bugs in MakeAbsolute filter involving base URIs that have empty path. The bugs are: * Undefined $is_folder variable when path is empty, and * Improper concatenation of host and path together. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-05 03:12:44 -04:00
Edward Z. Yang	965be3bd73	Add support for unrecognized elements in MakeWellFormed. The MakeWellFormed strategy uses metadata from HTMLDefinition in order to determine whether or not tokens need to be converted or tags need to be auto-closed. While this functionality is good to have, it is by no means essential, and MakeWellFormed should not error when this information is not available. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-05 03:11:29 -04:00
Edward Z. Yang	700d5bcbfc	Implement %AutoFormat.RemoveEmpty, end to start ref, and injector rewind. Injector rewind: Injectors can now use the method rewind() in order to move the input index backwards, so that they can reprocess tokens (other injectors are not affected by a rewind). This functionality was necessary to implement nested node removals in %AutoFormat.RemoveEmpty. End to start ref: To facilitate rewinding, HTMLPurifier_Token_End now maintains a reference called $start to the starting token for their node. %AutoFormat.RemoveEmpty removes empty nodes. Lots of people have requested it, so here is a partially effective implementation. Because it is implemented as an Injector, it's not possible for it to handle newly introduced empty nodes by later validators, specifically auto-closing and child validation. The Injector is only meant to be used on HTML-ish languages. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-27 16:09:14 -04:00
Edward Z. Yang	dba3ed7770	[3.1.2] Implement comments when %HTML.Trusted is on. Some implementation notes: not all comments are valid; HTML makes sure double-hyphens and trailing hyphens are not found in comments. In addition, two new localizable messages were added. Requested-by: Waldo Jaquith <waldo@vqronline.org> Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-25 23:12:19 -04:00
Edward Z. Yang	24f6db6fb2	[3.1.2] Add %Output.SortAttr to deal with FCKeditor bug If %Output.SortAttr is true, attributes are sorted to be in alphabetical order. This was requested by frank farmer. See also: http://htmlpurifier.org/phorum/read.php?2,1576 Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-24 22:36:27 -04:00
Edward Z. Yang	7727cea112	Add Git specific files and configuration * Setup usage.xml to be binary, as XMLWriter does not honor operating system's newline format. * Setup various files to ignore (svn:ignore was not carried over) * Add dummy files to prevent git from ignoring empty directories Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-24 22:02:16 -04:00
Edward Z. Yang	6bb8c1fcac	Handle CRLF discrepancies Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-24 21:10:51 -04:00
Edward Z. Yang	463aa3a0fa	[3.1.1] General munge improvements - Add CurrentCSSProperty context variable - Move Munge to its own class, derived off of SecureMunge. - Rename %URI.SecureMunge to %URI.Munge - Rename %URI.SecureMungeSecretKey to %URI.MungeSecretKey - Add extra substitutions for munge git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1803 48356398-32a2-884e-a903-53898d9a118a	2008-06-18 03:29:27 +00:00
Edward Z. Yang	643ed1bddc	[3.1.1] Fix text-decoration: none bug git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1799 48356398-32a2-884e-a903-53898d9a118a	2008-06-17 03:12:50 +00:00
Edward Z. Yang	486b401cf7	Fix broken tests. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1795 48356398-32a2-884e-a903-53898d9a118a	2008-06-12 03:12:39 +00:00
Edward Z. Yang	36bd06d53e	[3.1.1] Implement SafeEmbed. Also, miscellaneous bugfixes. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1781 48356398-32a2-884e-a903-53898d9a118a	2008-06-10 01:18:03 +00:00
Edward Z. Yang	13eb016e06	[3.1.1] Implement SafeObject. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1780 48356398-32a2-884e-a903-53898d9a118a	2008-06-10 00:13:44 +00:00
Edward Z. Yang	32025a12e1	[3.1.1] Allow injectors to be specified by modules. - Make method for URI implemented - Split out checkNeeded in Injector from prepare git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1779 48356398-32a2-884e-a903-53898d9a118a	2008-06-09 01:23:05 +00:00
Edward Z. Yang	3af2ff8f98	Fix bug with SecureMunge regarding embedded URIs. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1775 48356398-32a2-884e-a903-53898d9a118a	2008-06-02 17:39:29 +00:00
Edward Z. Yang	8d1f1e8e73	[3.1.1] Improved adherence to Unicode by checking for non-character codepoints. Thanks Geoffrey Sneddon for reporting. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1773 48356398-32a2-884e-a903-53898d9a118a	2008-05-26 21:27:52 +00:00
Edward Z. Yang	322288e6c0	[3.1.1] Implement %URI.SecureMunge and %URI.SecureMungeSecretKey, thanks Chris! - URIFilter->prepare can return false in order to abort loading of the filter - Implemented post URI filtering. Set member variable $post to true to set a URIFilter as such. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1772 48356398-32a2-884e-a903-53898d9a118a	2008-05-26 16:26:47 +00:00
Edward Z. Yang	14d934c7ca	[3.1.1] Land vs's HTMLPurifier_Generator patch, and a number of other bugfixes for that change - Convert a number of calls to use new constructor signature for Generator - Make generator require configuration; this exposes a number of latent bugs - Removed generator hack - Convert Printers to use new optimized ConfigSchema format - Hack with Printer configuration; pass an array(generator config, render config) to distinguish between output and target. - HTML/CSS Printers need to be primed, otherwise fatal errors - Convert a few test-cases to use member properties git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1770 48356398-32a2-884e-a903-53898d9a118a	2008-05-26 04:05:48 +00:00
Edward Z. Yang	bb16d8eae5	[3.1.1] Fix Shift_JIS encoding wonkiness with yen symbols and whatnot - Improve parseCDATA algorithm to take into account newline normalization - Fix regression in FontFamily validator. We now have a legit parser in place, albeit somewhat limited in use. Will be superseded by parser for entire grammar - Convert EncoderTest to new format git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1769 48356398-32a2-884e-a903-53898d9a118a	2008-05-25 05:40:20 +00:00
Edward Z. Yang	10530d7f81	[3.1.1] Fix stray backslashes in font-family. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1768 48356398-32a2-884e-a903-53898d9a118a	2008-05-24 18:19:36 +00:00
Edward Z. Yang	8ab30e24b7	[3.1.1] Memory optimizations for ConfigSchema. Changes include: - Elimination of ConfigDef and subclasses in favor of stdclass. Most property names stay the same - Added benchmark script for ConfigSchema - Types are internally handled as magic integers. Use HTMLPurifier_VarParser->getTypeName to convert to human readable form. HTMLPurifier_VarParser still accepts strings. - Parser in config schema only used for legacy interface git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1764 48356398-32a2-884e-a903-53898d9a118a	2008-05-23 16:43:24 +00:00
Edward Z. Yang	eb9f9bc7f6	[3.1.1] Round up imagecrash support with HTML.MaxImgLength - Add $max to AttrDef/HTML/Pixels.php - Add %HTML.MaxImgLength - CSS width/height allows percents when MaxImgLength is disabled git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1762 48356398-32a2-884e-a903-53898d9a118a	2008-05-23 02:09:43 +00:00
Edward Z. Yang	8d0d0d1a03	[3.1.1] construct() to setup() in HTMLModules git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1760 48356398-32a2-884e-a903-53898d9a118a	2008-05-22 04:34:19 +00:00
Edward Z. Yang	80f59206d7	[3.1.1] Implement percent encoding for URI query and fragment git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1758 48356398-32a2-884e-a903-53898d9a118a	2008-05-21 02:58:41 +00:00
Edward Z. Yang	1a95852007	[3.1.1] Implement more robust imagecrash protection for CSS width/height. - Change API for HTMLPurifier_AttrDef_CSS_Length - Implement HTMLPurifier_AttrDef_Switch class - Implement HTMLPurifier_Length->compareTo, and make make() accept object instances git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1754 48356398-32a2-884e-a903-53898d9a118a	2008-05-21 01:56:48 +00:00
Edward Z. Yang	c3fab7200e	Add support for pixel as a pseudo-English unit. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1753 48356398-32a2-884e-a903-53898d9a118a	2008-05-21 00:42:55 +00:00
Edward Z. Yang	6d7a17e9b6	Implement without-bcmath compatible UnitConverter. We might want to factor our floating point fudges. These calculations are only accurate for small precisions, and are architecture-dependent. (Unit tests seem to work on 32bit, though). git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1752 48356398-32a2-884e-a903-53898d9a118a	2008-05-21 00:29:31 +00:00
Edward Z. Yang	64b5581bf2	[3.1.1] Have CSS/Length.php use the new Length class. Also, put onus of non-negative to callee, which would compare $n. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1751 48356398-32a2-884e-a903-53898d9a118a	2008-05-20 23:15:20 +00:00
Edward Z. Yang	d8da5ff406	Finally stabilize the unit converter. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1750 48356398-32a2-884e-a903-53898d9a118a	2008-05-20 21:23:38 +00:00
Edward Z. Yang	fda310f1e7	Update UnitConverter to deal more correctly with X.XX... decimals. Not complete. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1749 48356398-32a2-884e-a903-53898d9a118a	2008-05-20 17:48:15 +00:00
Edward Z. Yang	fc7dbdbd33	Disable Tidy test completely. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1748 48356398-32a2-884e-a903-53898d9a118a	2008-05-20 17:14:08 +00:00
Edward Z. Yang	16fa73afa0	[3.1.1] Added HTMLPurifier_UnitConverter and HTMLPurifier_Length for convenient handling of CSS-style lengths. - Fixed another de-underscoring in the SimpleTest library git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1746 48356398-32a2-884e-a903-53898d9a118a	2008-05-20 01:19:00 +00:00
Edward Z. Yang	86b1da9b6f	[3.1.0] Fixed bug with fallback languages in LanguageFactory - Also, reverted bogus Generator changes git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1723 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 23:04:46 +00:00
Edward Z. Yang	00ea2062d4	[3.1.0] Fix buggy LanguageFactory. This revision is incomplete. - Some bogus commits to Generator were made, and will be reverted next revision. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1722 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 17:47:47 +00:00
Edward Z. Yang	cb5d5d0648	[3.1.0] Revamp URI handling of percent encoding and validation. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1709 48356398-32a2-884e-a903-53898d9a118a	2008-05-14 02:19:00 +00:00
Edward Z. Yang	e0c0d8eab6	[3.1.0] Allow arbitrary whitespace in %HTML.Allowed git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1707 48356398-32a2-884e-a903-53898d9a118a	2008-05-13 02:02:27 +00:00
Edward Z. Yang	ce46fb618c	[3.1.0] Add missing tests and errors for forbidden attributes git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1706 48356398-32a2-884e-a903-53898d9a118a	2008-05-13 01:41:25 +00:00
Edward Z. Yang	aaf6ba421c	Sync with SimpleTest codebase git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1701 48356398-32a2-884e-a903-53898d9a118a	2008-04-28 19:52:13 +00:00
Edward Z. Yang	4b862f64e6	[3.1.0] Fix ScriptRequired bug with trusted installs - Generator now takes $config and $context during instantiation - Double quotes outside of attributes are not escaped git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1700 48356398-32a2-884e-a903-53898d9a118a	2008-04-28 01:35:07 +00:00
Edward Z. Yang	2f29c27a59	[3.1.0] Fix broken PH5P in latest versions of DOM with bandaid; punt to DirectLex. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1698 48356398-32a2-884e-a903-53898d9a118a	2008-04-26 19:47:22 +00:00
Edward Z. Yang	144bd6f07a	[3.1.0] Fix bug with 3.1.0-dev version number (the dash caused problems, so we switched to commas) - Refactored out null definition cache during HTMLDefinition tests git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1697 48356398-32a2-884e-a903-53898d9a118a	2008-04-26 19:28:14 +00:00

675 Commits