htmlpurifier

mirror of https://github.com/ezyang/htmlpurifier.git synced 2025-04-01 09:37:04 +00:00

Author	SHA1	Message	Date
Edward Z. Yang	3184fee468	Undo start()/end() error collector changes in AttrValidator. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-09-05 17:25:35 -04:00
Edward Z. Yang	ed7983b559	Refactor lexer instantiation logic with exceptions and forced line tracking. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-09-05 14:04:23 -04:00
Edward Z. Yang	c6914dce51	Track column numbers in addition to line numbers. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-09-01 14:10:10 -04:00
Edward Z. Yang	d9e60350d3	Migrate AttrValidator to nested error format; modify generator logic in ErrorCollector. AttrValidator's changes are fairly self-explanatory, but ErrorCollector's changes are worth a little discussion. ErrorCollector can use generators at various points during its flow control; there are two distinct generators that it should use: 1. The one used for the output, and 2. The one used for the error output. These will usually be the same, but in the odd case where they need to be different, getHTMLFormatted() will accept an alternate configuration object with an appropriate doctype. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-18 22:13:58 -04:00
Edward Z. Yang	c807ed5fe2	Implement nested error collection with start() and end() in ErrorCollector. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-16 00:41:34 -04:00
Edward Z. Yang	c9b6f125aa	Forms implementation for %HTML.Trusted. Some backend changes: * Added Charsets and Character attribute types * Fix a heavily recursive form of ContentSets, this allows a content-set to include another content-set which includes another content-set, and so forth. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-15 18:57:44 -04:00
Edward Z. Yang	dc28346677	Fix bug where absolute paths with dots/double-dots were not collapsed. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-15 13:12:54 -04:00
Edward Z. Yang	8423daef05	Increase test coverage for MakeAbsolute. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-13 23:19:38 -04:00
Edward Z. Yang	617f70a8ac	Improve auto-paragraph to preserve newlines and handle edge-cases better. This is a very large commit that includes numerous improvements to the AutoParagraph injector. These are: * Rewritten flow control of the injector to use almost exclusively binary conditionals. * Improved inline documentation with "State" comments, which give concise examples of what the token stack looks like at flow points. * Documentation for all flow branches, even those with no actions. * Factoring out of common operations to improve readability, especially the new iterator private methods. * Expanded test-suite which covers new flow points, and corrects some errors in previous cases. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-10 00:32:29 -04:00
Edward Z. Yang	e013bc9126	Fix bug involving autoclose and inline elements in strict <blockquote>. The newest autoclose code uses the elements property in whether or not an element should be closed by a particular tag. The heuristic is simple; if the element doesn't allow that tag as a child, it closes the parent container. This doesn't work, however, with <blockquote>, which while not allowing inline styles under Strict doctypes, requires them to be passed through MakeWellFormed. The fix was to transition MakeWellFormed to call a method to retrieve the elements, and then have StrictBlockquote implement a special version of this method. Future versions of HTML Purifier may be more flexible in this regard--further study of the HTML5 specification is required. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-01 20:52:06 -04:00
Edward Z. Yang	1d90bb2397	Allow <![CDATA[<body>...</body>]]> not to trigger Core.ConvertDocumentToFragment Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-08-01 19:06:28 -04:00
Edward Z. Yang	85090520f1	Add double-munging protection by checking if the domains are the same. Previously, if an absolute munge URL location was used, HTML passed through HTML Purifier multiple times would be munged multiple times. This patch checks if the output URI has the same URI as the input URI; if they do, the munge is considered unnecessary and discarded. Requested-by: Chris <justbittin@gmail.com> Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-26 22:45:19 -06:00
Edward Z. Yang	3b6aa10592	%URI.DisableExternal(Resources) uses %URI.Base if %URI.Host is not available. As part of its duties, URIDefinition determines the base URL and the host URL of the page based on the two corresponding configuration directives. The DisableExternal URIFilter, however, bypassed this check by directly checking %URI.Host. This fix forwards the call through URIDefinition. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-10 18:46:46 -04:00
Edward Z. Yang	e05bd77344	Implement HTMLT tests, and migrate HTMLPurifierTest to this format. HTMLT tests are a compact and easy-to-use way of making assertPurification type tests. They take the format of: --INI-- Ns.Directive = "directive value" --HTML-- Input HTML --EXPECT-- Expected HTML Expect more features and migration to be coming soon. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-07 08:59:33 -04:00
Edward Z. Yang	334ffac5b4	Various improvements to test script command line options, i.e. --type The following changes were made: * Create --type parameter which accepts 'htmlpurifier', 'phpt', 'vtest', etc. in order to execute only that class of tests. This supersedes --only-phpt. * Create --quick parameter for multitest.php, run only the tips of each release series. * Create --distro parameter for multitest.php, supersedes --exclude-normal and --exclude-standalone. Also, a grep for htmlt tests was added, although add_tests() doesn't do anything with it yet. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-07 08:59:29 -04:00
Edward Z. Yang	a227cb483a	Allow empty sections in string hashes; previously they were left undefined. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-07 08:57:16 -04:00
Edward Z. Yang	aa0fdeee30	Refine Lexers for parsing stray angled brackets; %Core.AggressivelyFixLt = true By default, the DirectLex and DOMLex behavior with stray angled brackets varied a great deal due to their implementations. A little known directive %Core.AggressivelyFixLt attempted to match DOMLex's behavior with DirectLex's, but it was off by default. By turning it on by default, users now enjoy these benefits, and performance-minded users can turn it back off. Also, several refinements to stray angled bracket parsing were made. Specifically: * DirectLex: Handle each left angled bracket individually, which prevents strange behavior as reported by eon. * DOMLex: Iterate aggressive lt fix, so that stacked brackets like << are handled. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-07 08:52:29 -04:00
Edward Z. Yang	ba418a1f19	Redirect stderr to stdout when calling flush.php Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-05 03:15:36 -04:00
Edward Z. Yang	c845f0bb78	Give warnings when attempting to use encoding iconv doesn't support. Previously, attempting to set %Core.Encoding to an encoding iconv didn't know about would result in a silent failure, with the return of the boolean false. Now it will fatally error out. Reported-by: mcgrailm <mgm19@psu.edu> Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-05 03:14:32 -04:00
Edward Z. Yang	594268ca3b	Fix two bugs in MakeAbsolute filter involving base URIs that have empty path. The bugs are: * Undefined $is_folder variable when path is empty, and * Improper concatenation of host and path together. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-05 03:12:44 -04:00
Edward Z. Yang	965be3bd73	Add support for unrecognized elements in MakeWellFormed. The MakeWellFormed strategy uses metadata from HTMLDefinition in order to determine whether or not tokens need to be converted or tags need to be auto-closed. While this functionality is good to have, it is by no means essential, and MakeWellFormed should not error when this information is not available. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-07-05 03:11:29 -04:00
Edward Z. Yang	700d5bcbfc	Implement %AutoFormat.RemoveEmpty, end to start ref, and injector rewind. Injector rewind: Injectors can now use the method rewind() in order to move the input index backwards, so that they can reprocess tokens (other injectors are not affected by a rewind). This functionality was necessary to implement nested node removals in %AutoFormat.RemoveEmpty. End to start ref: To facilitate rewinding, HTMLPurifier_Token_End now maintains a reference called $start to the starting token for their node. %AutoFormat.RemoveEmpty removes empty nodes. Lots of people have requested it, so here is a partially effective implementation. Because it is implemented as an Injector, it's not possible for it to handle newly introduced empty nodes by later validators, specifically auto-closing and child validation. The Injector is only meant to be used on HTML-ish languages. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-27 16:09:14 -04:00
Edward Z. Yang	fd384129bf	Proper support for name attribute in <a> and <img> Prior to this commit, the name attribute was unilaterally removed, except for Strict doctypes or a heavy TidyLevel, when it was converted to an id attribute. As name is actually permitted in both HTML 4.01 Strict and XHTML 1.0 Strict, although deprecated, the more sensible default behavior is to allow it unless TidyLevel is heavy. Our implementation is slightly stricter than the specs, as name attributes are treated as first class IDs, disallowing <a name="foo" id="foo"> or duplicate names. The former should be treated as a special case, but that will be a separate commit. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-27 15:44:27 -04:00
Edward Z. Yang	a5ceb1e22a	Update printTokens() debug function to work with new Generator API. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-27 01:33:20 -04:00
Edward Z. Yang	dba3ed7770	[3.1.2] Implement comments when %HTML.Trusted is on. Some implementation notes: not all comments are valid; HTML makes sure double-hyphens and trailing hyphens are not found in comments. In addition, two new localizable messages were added. Requested-by: Waldo Jaquith <waldo@vqronline.org> Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-25 23:12:19 -04:00
Edward Z. Yang	24f6db6fb2	[3.1.2] Add %Output.SortAttr to deal with FCKeditor bug If %Output.SortAttr is true, attributes are sorted to be in alphabetical order. This was requested by frank farmer. See also: http://htmlpurifier.org/phorum/read.php?2,1576 Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-24 22:36:27 -04:00
Edward Z. Yang	7727cea112	Add Git specific files and configuration * Setup usage.xml to be binary, as XMLWriter does not honor operating system's newline format. * Setup various files to ignore (svn:ignore was not carried over) * Add dummy files to prevent git from ignoring empty directories Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-24 22:02:16 -04:00
Edward Z. Yang	6bb8c1fcac	Handle CRLF discrepancies Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>	2008-06-24 21:10:51 -04:00
Edward Z. Yang	463aa3a0fa	[3.1.1] General munge improvements - Add CurrentCSSProperty context variable - Move Munge to its own class, derived off of SecureMunge. - Rename %URI.SecureMunge to %URI.Munge - Rename %URI.SecureMungeSecretKey to %URI.MungeSecretKey - Add extra substitutions for munge git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1803 48356398-32a2-884e-a903-53898d9a118a	2008-06-18 03:29:27 +00:00
Edward Z. Yang	643ed1bddc	[3.1.1] Fix text-decoration: none bug git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1799 48356398-32a2-884e-a903-53898d9a118a	2008-06-17 03:12:50 +00:00
Edward Z. Yang	486b401cf7	Fix broken tests. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1795 48356398-32a2-884e-a903-53898d9a118a	2008-06-12 03:12:39 +00:00
Edward Z. Yang	36bd06d53e	[3.1.1] Implement SafeEmbed. Also, miscellaneous bugfixes. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1781 48356398-32a2-884e-a903-53898d9a118a	2008-06-10 01:18:03 +00:00
Edward Z. Yang	13eb016e06	[3.1.1] Implement SafeObject. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1780 48356398-32a2-884e-a903-53898d9a118a	2008-06-10 00:13:44 +00:00
Edward Z. Yang	32025a12e1	[3.1.1] Allow injectors to be specified by modules. - Make method for URI implemented - Split out checkNeeded in Injector from prepare git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1779 48356398-32a2-884e-a903-53898d9a118a	2008-06-09 01:23:05 +00:00
Edward Z. Yang	3af2ff8f98	Fix bug with SecureMunge regarding embedded URIs. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1775 48356398-32a2-884e-a903-53898d9a118a	2008-06-02 17:39:29 +00:00
Edward Z. Yang	36fb284d2f	Add integration test, and fix broken SecureMunge git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1774 48356398-32a2-884e-a903-53898d9a118a	2008-05-27 17:47:25 +00:00
Edward Z. Yang	8d1f1e8e73	[3.1.1] Improved adherence to Unicode by checking for non-character codepoints. Thanks Geoffrey Sneddon for reporting. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1773 48356398-32a2-884e-a903-53898d9a118a	2008-05-26 21:27:52 +00:00
Edward Z. Yang	322288e6c0	[3.1.1] Implement %URI.SecureMunge and %URI.SecureMungeSecretKey, thanks Chris! - URIFilter->prepare can return false in order to abort loading of the filter - Implemented post URI filtering. Set member variable $post to true to set a URIFilter as such. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1772 48356398-32a2-884e-a903-53898d9a118a	2008-05-26 16:26:47 +00:00
Edward Z. Yang	14d934c7ca	[3.1.1] Land vs's HTMLPurifier_Generator patch, and a number of other bugfixes for that change - Convert a number of calls to use new constructor signature for Generator - Make generator require configuration; this exposes a number of latent bugs - Removed generator hack - Convert Printers to use new optimized ConfigSchema format - Hack with Printer configuration; pass an array(generator config, render config) to distinguish between output and target. - HTML/CSS Printers need to be primed, otherwise fatal errors - Convert a few test-cases to use member properties git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1770 48356398-32a2-884e-a903-53898d9a118a	2008-05-26 04:05:48 +00:00
Edward Z. Yang	bb16d8eae5	[3.1.1] Fix Shift_JIS encoding wonkiness with yen symbols and whatnot - Improve parseCDATA algorithm to take into account newline normalization - Fix regression in FontFamily validator. We now have a legit parser in place, albeit somewhat limited in use. Will be superseded by parser for entire grammar - Convert EncoderTest to new format git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1769 48356398-32a2-884e-a903-53898d9a118a	2008-05-25 05:40:20 +00:00
Edward Z. Yang	10530d7f81	[3.1.1] Fix stray backslashes in font-family. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1768 48356398-32a2-884e-a903-53898d9a118a	2008-05-24 18:19:36 +00:00
Edward Z. Yang	8ab30e24b7	[3.1.1] Memory optimizations for ConfigSchema. Changes include: - Elimination of ConfigDef and subclasses in favor of stdclass. Most property names stay the same - Added benchmark script for ConfigSchema - Types are internally handled as magic integers. Use HTMLPurifier_VarParser->getTypeName to convert to human readable form. HTMLPurifier_VarParser still accepts strings. - Parser in config schema only used for legacy interface git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1764 48356398-32a2-884e-a903-53898d9a118a	2008-05-23 16:43:24 +00:00
Edward Z. Yang	eb9f9bc7f6	[3.1.1] Round up imagecrash support with HTML.MaxImgLength - Add $max to AttrDef/HTML/Pixels.php - Add %HTML.MaxImgLength - CSS width/height allows percents when MaxImgLength is disabled git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1762 48356398-32a2-884e-a903-53898d9a118a	2008-05-23 02:09:43 +00:00
Edward Z. Yang	8d0d0d1a03	[3.1.1] construct() to setup() in HTMLModules git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1760 48356398-32a2-884e-a903-53898d9a118a	2008-05-22 04:34:19 +00:00
Edward Z. Yang	80f59206d7	[3.1.1] Implement percent encoding for URI query and fragment git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1758 48356398-32a2-884e-a903-53898d9a118a	2008-05-21 02:58:41 +00:00
Edward Z. Yang	1a95852007	[3.1.1] Implement more robust imagecrash protection for CSS width/height. - Change API for HTMLPurifier_AttrDef_CSS_Length - Implement HTMLPurifier_AttrDef_Switch class - Implement HTMLPurifier_Length->compareTo, and make make() accept object instances git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1754 48356398-32a2-884e-a903-53898d9a118a	2008-05-21 01:56:48 +00:00
Edward Z. Yang	c3fab7200e	Add support for pixel as a pseudo-English unit. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1753 48356398-32a2-884e-a903-53898d9a118a	2008-05-21 00:42:55 +00:00
Edward Z. Yang	6d7a17e9b6	Implement without-bcmath compatible UnitConverter. We might want to factor our floating point fudges. These calculations are only accurate for small precisions, and are architecture-dependent. (Unit tests seem to work on 32bit, though). git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1752 48356398-32a2-884e-a903-53898d9a118a	2008-05-21 00:29:31 +00:00
Edward Z. Yang	64b5581bf2	[3.1.1] Have CSS/Length.php use the new Length class. Also, put onus of non-negative to callee, which would compare $n. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1751 48356398-32a2-884e-a903-53898d9a118a	2008-05-20 23:15:20 +00:00
Edward Z. Yang	d8da5ff406	Finally stabilize the unit converter. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1750 48356398-32a2-884e-a903-53898d9a118a	2008-05-20 21:23:38 +00:00
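
The HTMLT test format described in the e05bd77344 commit message (sections introduced by --INI--, --HTML-- and --EXPECT-- markers) could be read by a small section splitter. The following is only a sketch, written in Python for illustration; HTML Purifier itself is PHP, and `parse_htmlt` is a hypothetical name, not the project's actual parser. It assumes each marker sits alone on its own line and treats a section with no body as an empty string, which is in the spirit of the a227cb483a change that allows empty sections.

```python
import re

def parse_htmlt(text):
    """Split HTMLT-style text into a dict of section name -> section body.

    Assumption: a section starts at a line of the form --NAME-- (uppercase
    letters only) and runs until the next such line or the end of the text.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        marker = re.fullmatch(r"--([A-Z]+)--", line.strip())
        if marker:
            # Start a new section; an empty body is kept as an empty list.
            current = marker.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}

sample = """--INI--
Ns.Directive = "directive value"
--HTML--
Input HTML
--EXPECT--
Expected HTML"""

parsed = parse_htmlt(sample)
print(parsed["INI"])
print(parsed["HTML"])
print(parsed["EXPECT"])
```

A test runner built on this would presumably apply the INI section as configuration, purify the HTML section, and compare the result against the EXPECT section; that step is left out here because it depends on the library's own API.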

580 Commits