0
0
mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-09-20 19:25:19 +00:00
Commit Graph

34 Commits

Author SHA1 Message Date
Edward Z. Yang
0767bbc12d Rewrite FixNesting implementation to be tree-based.
This mega-patch rips out the FixNesting implementation and the related
ChildDef components.  The primary algorithmic change is to convert from
use of tokens to tree nodes, which are far more amenable to the style
of processing that FixNesting uses.  Additionally, FixNesting has been
changed to go bottom-up rather than top-down, in order to avoid needing
to implement backtracking.

This patch simplifies a good deal of the relevant logic, since we no
longer need to continually recalculate the nesting structure when
processing things.  However, the conversion to the alternate format
incurs some overhead, so for small inputs these changes are not a win.
One possibility to greatly reduce the constant factors here is to switch
to entirely using libxml's representation, and never serializing tokens;
this would require one to rewrite injectors, however.

The iterative post-order traversal in FixNesting is a bit subtle, but
we have essentially reified the stack and continuations.

We've removed support for %Core.EscapeInvalidChildren.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-10-20 22:37:01 -07:00
Edward Z. Yang
b3640e1af6 Add conversion functions for our own tree format.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-10-20 15:05:11 -07:00
Marcus Bointon
fac747bdbd PSR-2 reformatting PHPDoc corrections
With minor corrections.

Signed-off-by: Marcus Bointon <marcus@synchromedia.co.uk>
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-08-17 22:27:26 -04:00
Edward Z. Yang
631021733b Add %Core.DisableExcludes directive
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-02-17 15:47:38 -08:00
Edward Z. Yang
12b811d749 Add vim modelines to all files.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 04:24:59 -05:00
Edward Z. Yang
2c955af135 Remove trailing whitespace.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 02:28:20 -05:00
Edward Z. Yang
522c8ed7c2 [3.1.0] The bulk of autoload support added
- Add FSTools:globr()
- require_once removed from all files
- HTMLPurifier.autoload.php added to register autoload handler
- Removed redundant chdir in maintenance script
- Modified standalone to use HTMLPurifier.includes.php for including stuff
- Added maintenance script remove-require-once.php which we used once and should never use again

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1516 48356398-32a2-884e-a903-53898d9a118a
2008-01-27 01:54:41 +00:00
Edward Z. Yang
5eee08c548 [3.1.0] Convert tokens to use instanceof, reducing memory footprint and improving comparison speed.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1509 48356398-32a2-884e-a903-53898d9a118a
2008-01-19 20:23:01 +00:00
Edward Z. Yang
a7fab00cdd [3.0.0] Convert all $context calls away from references
- Update TODO list
- URISchemeRegistry doesn't return a reference for instance anymore, should do the same for other singletons

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1477 48356398-32a2-884e-a903-53898d9a118a
2008-01-05 00:10:43 +00:00
Edward Z. Yang
43f01925cd Convert to PHP 5 only codebase, adding visibility modifiers to all members and methods in the main library area (function only for test methods)
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1458 48356398-32a2-884e-a903-53898d9a118a
2007-11-25 02:24:39 +00:00
Edward Z. Yang
f5371bbad4 [2.1.3]
- Buggy treatment of end tags of elements that have required attributes fixed (does not manifest on default tag-set)
- Spurious internal content reorganization error suppressed
. Error unit tests can now specify the expectation of no errors. Future iterations of the harness will be extremely strict about what errors are allowed

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1424 48356398-32a2-884e-a903-53898d9a118a
2007-10-02 01:19:46 +00:00
Edward Z. Yang
e99520ab96 Remove trailing ?> in PHP library files, add trailing newlines to all other files.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1253 48356398-32a2-884e-a903-53898d9a118a
2007-06-27 13:58:32 +00:00
Edward Z. Yang
a005da8a4c [2.0.1] Add error messages for FixNesting
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1249 48356398-32a2-884e-a903-53898d9a118a
2007-06-26 23:43:28 +00:00
Edward Z. Yang
ee61ffc0d9 Minor test-case refactoring.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1100 48356398-32a2-884e-a903-53898d9a118a
2007-05-27 23:12:17 +00:00
Edward Z. Yang
797d3e0393 [1.7.0] Rewire dependencies, removing redundant includes and adding necessary ones
- Rework descendants_are_inline to have default value as false, ins/del handling now works top-level when parent element is not block
- Remove CleanUTF8OnGeneration, feature didn't even work

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1086 48356398-32a2-884e-a903-53898d9a118a
2007-05-22 00:47:03 +00:00
Edward Z. Yang
77d9e05a07 [1.5.0] Massive refactoring for Blockquote and Chameleon to be more extensible and accommodating of XHTMLDefinition.
- Fixed buggy chameleon-support for ins and del
. Removed context variable ParentType, replaced with IsInline, which
  is false when you're not inline and an integer of the parent that
  caused you to become inline when you are (so possibly zero)
. Removed ElementDef->type in favor of ElementDef->descendants_are_inline
  and HTMLDefinition->content_sets
. StrictBlockquote now reports what elements its supposed to allow,
  rather than what it does allow
. Removed HTMLDefinition->info_flow_elements in favor of
  HTMLDefinition->content_sets['Flow']
. Removed redundant "exclusionary" definitions from DTD roster
. StrictBlockquote now requires a construction parameter as if it
  were an Required ChildDef, this is the "real" set of allowed elements

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@710 48356398-32a2-884e-a903-53898d9a118a
2007-02-04 03:53:57 +00:00
Edward Z. Yang
b73b5100fd [1.3.1] Add defense in depth measure: reject entire node if there is no child definition for the element.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@601 48356398-32a2-884e-a903-53898d9a118a
2006-12-06 22:38:25 +00:00
Edward Z. Yang
925a07b828 [1.3.0] New directives %HTML.AllowedElements and %HTML.AllowedAttributes to let users narrow the set of allowed tags
. Added HTMLPurifier->info_parent_def, parent child processing made special

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@565 48356398-32a2-884e-a903-53898d9a118a
2006-11-23 13:51:19 +00:00
Edward Z. Yang
b1b3377b9c [1.3.0] Huge upgrade, (X)HTML Strict now supported
+ Transparently handles inline elements in block context (blockquote)
! Added GET method to demo for easier validation, added 50kb max input size
! New directive %HTML.BlockWrapper, for block-ifying inline elements
! New directive %HTML.Parent, allows you to only allow inline content
- Added missing type to ChildDef_Chameleon
. ChildDef_Required guards against empty tags
. Lookup table HTMLDefinition->info_flow_elements added
. Added peace-of-mind variable initialization to Strategy_FixNesting

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@560 48356398-32a2-884e-a903-53898d9a118a
2006-11-23 03:23:35 +00:00
Edward Z. Yang
8f515b9cda [1.2.0]
- Partially finished migrating to new Context object (done in r485).
- Created HTMLPurifier_Harness to assist with testing, ChildDefTest migrated to that framework.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@484 48356398-32a2-884e-a903-53898d9a118a
2006-10-01 20:47:07 +00:00
Edward Z. Yang
e440f25bce [1.1] Table child definition made more flexible, will fix up poorly ordered elements
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@417 48356398-32a2-884e-a903-53898d9a118a
2006-09-15 01:52:22 +00:00
Edward Z. Yang
14aeafcf22 De-singleton-ized (HTML|CSS)Definition, tying them to the configuration and making them more amenable to changes.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@350 48356398-32a2-884e-a903-53898d9a118a
2006-08-31 20:33:07 +00:00
Edward Z. Yang
e770d994a7 Rename Definition to HTMLDefinition.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@255 48356398-32a2-884e-a903-53898d9a118a
2006-08-14 21:22:49 +00:00
Edward Z. Yang
238678871e - Fixed lots of bugs
- Defined new directive %Core.EscapeInvalidChildren, for previously commented out functionality
- Removed convenience configuration generation: you *have* to pass it unless you're interfacing with HTMLPurifier
- Homogenized function parameters even when only a few of them are used
- Rewrote unit tests that expected previous behavior
- Introduced configuration object to ChildDef tests

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@243 48356398-32a2-884e-a903-53898d9a118a
2006-08-14 02:46:34 +00:00
Edward Z. Yang
9cec089f97 Profusely comment FixNesting.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@183 48356398-32a2-884e-a903-53898d9a118a
2006-08-07 20:28:12 +00:00
Edward Z. Yang
f0deae1fc0 Update documentation.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@147 48356398-32a2-884e-a903-53898d9a118a
2006-08-03 01:37:28 +00:00
Edward Z. Yang
26733183b7 Add support for hard exclusions that affect all child nodes.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@146 48356398-32a2-884e-a903-53898d9a118a
2006-08-03 01:18:57 +00:00
Edward Z. Yang
aa249be067 Fix chameleon behavior with ins and del.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@145 48356398-32a2-884e-a903-53898d9a118a
2006-08-03 01:03:23 +00:00
Edward Z. Yang
9d411bd5cc - Implement double-checking in Strategy/FixNesting.php, fixes the table bugs.
- Move around child definitions so they make a little more sense (rename to Custom) and also add $allow_empty property to help FixNesting.php determine whether or not to double-check.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@136 48356398-32a2-884e-a903-53898d9a118a
2006-07-31 03:04:57 +00:00
Edward Z. Yang
2b5589c884 Factor some stuff into the Definition, add more docs.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@134 48356398-32a2-884e-a903-53898d9a118a
2006-07-30 22:57:54 +00:00
Edward Z. Yang
558c49a92d Make the definition format much more logical. Begin migrating specification docs to their respective classes.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@133 48356398-32a2-884e-a903-53898d9a118a
2006-07-30 19:11:18 +00:00
Edward Z. Yang
f8eaedb500 Factor out definitions to a ['child'] so that we could assign the ['attr'] definitions separately.
Also, added AttrDef/EnumTest.php

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@127 48356398-32a2-884e-a903-53898d9a118a
2006-07-30 00:54:38 +00:00
Edward Z. Yang
619d5d9bc1 Migrate strategies to separate classes complete.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@119 48356398-32a2-884e-a903-53898d9a118a
2006-07-24 02:49:37 +00:00
Edward Z. Yang
d98f1742ec Extract FixNesting strategy from Definition object.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@113 48356398-32a2-884e-a903-53898d9a118a
2006-07-24 01:54:25 +00:00