0
0
mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-11-14 01:08:41 +00:00
Commit Graph

16 Commits

Author SHA1 Message Date
Edward Z. Yang
0767bbc12d Rewrite FixNesting implementation to be tree-based.
This mega-patch rips out the FixNesting implementation and the related
ChildDef components.  The primary algorithmic change is to convert from
use of tokens to tree nodes, which are far more amenable to the style
of processing that FixNesting uses.  Additionally, FixNesting has been
changed to go bottom-up rather than top-down, in order to avoid needing
to implement backtracking.

This patch simplifies a good deal of the relevant logic, since we no
longer need to continually recalculate the nesting structure when
processing things.  However, the conversion to the alternate format
incurs some overhead, so for small inputs these changes are not a win.
One possibility to greatly reduce the constant factors here is to switch
to entirely using libxml's representation, and never serializing tokens;
this would require one to rewrite injectors, however.

The iterative post-order traversal in FixNesting is a bit subtle, but
we have essentially reified the stack and continuations.

We've removed support for %Core.EscapeInvalidChildren.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-10-20 22:37:01 -07:00
Marcus Bointon
fac747bdbd PSR-2 reformatting PHPDoc corrections
With minor corrections.

Signed-off-by: Marcus Bointon <marcus@synchromedia.co.uk>
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-08-17 22:27:26 -04:00
Edward Z. Yang
bfe474042f Implement "carryover" functionality, requested by Kinderlehrer <bitweaver@7doves.com>
This commit is a limited implementation of the "active formatting
elements" algorithm implemented in HTML5, which preserves certain
formatting elements such as <a> and <b> when exiting or entering nodes.

Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-20 13:06:00 -05:00
Edward Z. Yang
12b811d749 Add vim modelines to all files.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 04:24:59 -05:00
Edward Z. Yang
2c955af135 Remove trailing whitespace.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 02:28:20 -05:00
Edward Z. Yang
e013bc9126 Fix bug involving autoclose and inline elements in strict <blockquote>.
The newest autoclose code uses the elements property in whether or not an
element should be closed by a particular tag.  The heuristic is simple; if
the element doesn't allow that tag as a child, it closes the parent
container.  This doesn't work, however, with <blockquote>, which while not
allowing inline styles under Strict doctypes, requires them to be passed
through MakeWellFormed.

The fix was to transition MakeWellFormed to call a method to retrieve the
elements, and then have StrictBlockquote implement a special version of
this method.  Future versions of HTML Purifier may be more flexible in this
regard--further study of the HTML5 specification is required.

Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-08-01 20:52:06 -04:00
Edward Z. Yang
522c8ed7c2 [3.1.0] The bulk of autoload support added
- Add FSTools:globr()
- require_once removed from all files
- HTMLPurifier.autoload.php added to register autoload handler
- Removed redundant chdir in maintenance script
- Modified standalone to use HTMLPurifier.includes.php for including stuff
- Added maintenance script remove-require-once.php which we used once and should never use again

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1516 48356398-32a2-884e-a903-53898d9a118a
2008-01-27 01:54:41 +00:00
Edward Z. Yang
81a4cf6a14 [3.1.0] Preparation for autoload
- Factor out getPath from autoload implementation to its own function
- Ensure extends is on the same line as class for all files

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1515 48356398-32a2-884e-a903-53898d9a118a
2008-01-27 00:06:10 +00:00
Edward Z. Yang
5eee08c548 [3.1.0] Convert tokens to use instanceof, reducing memory footprint and improving comparison speed.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1509 48356398-32a2-884e-a903-53898d9a118a
2008-01-19 20:23:01 +00:00
Edward Z. Yang
a7fab00cdd [3.0.0] Convert all $context calls away from references
- Update TODO list
- URISchemeRegistry doesn't return a reference for instance anymore, should do the same for other singletons

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1477 48356398-32a2-884e-a903-53898d9a118a
2008-01-05 00:10:43 +00:00
Edward Z. Yang
43f01925cd Convert to PHP 5 only codebase, adding visibility modifiers to all members and methods in the main library area (function only for test methods)
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1458 48356398-32a2-884e-a903-53898d9a118a
2007-11-25 02:24:39 +00:00
Edward Z. Yang
e99520ab96 Remove trailing ?> in PHP library files, add trailing newlines to all other files.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1253 48356398-32a2-884e-a903-53898d9a118a
2007-06-27 13:58:32 +00:00
Edward Z. Yang
220c150e0a [1.7.0] StrictBlockquote child definition refrains from wrapping whitespace in tags now.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1159 48356398-32a2-884e-a903-53898d9a118a
2007-06-18 19:53:46 +00:00
Edward Z. Yang
236159242f Enforce info_ prefix convention for data that is accessed by HTML Purifier internals.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@720 48356398-32a2-884e-a903-53898d9a118a
2007-02-04 22:08:51 +00:00
Edward Z. Yang
77d9e05a07 [1.5.0] Massive refactoring for Blockquote and Chameleon to be more extensible and accommodating of XHTMLDefinition.
- Fixed buggy chameleon-support for ins and del
. Removed context variable ParentType, replaced with IsInline, which
  is false when you're not inline and an integer of the parent that
  caused you to become inline when you are (so possibly zero)
. Removed ElementDef->type in favor of ElementDef->descendants_are_inline
  and HTMLDefinition->content_sets
. StrictBlockquote now reports what elements its supposed to allow,
  rather than what it does allow
. Removed HTMLDefinition->info_flow_elements in favor of
  HTMLDefinition->content_sets['Flow']
. Removed redundant "exclusionary" definitions from DTD roster
. StrictBlockquote now requires a construction parameter as if it
  were an Required ChildDef, this is the "real" set of allowed elements

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@710 48356398-32a2-884e-a903-53898d9a118a
2007-02-04 03:53:57 +00:00
Edward Z. Yang
b1b3377b9c [1.3.0] Huge upgrade, (X)HTML Strict now supported
+ Transparently handles inline elements in block context (blockquote)
! Added GET method to demo for easier validation, added 50kb max input size
! New directive %HTML.BlockWrapper, for block-ifying inline elements
! New directive %HTML.Parent, allows you to only allow inline content
- Added missing type to ChildDef_Chameleon
. ChildDef_Required guards against empty tags
. Lookup table HTMLDefinition->info_flow_elements added
. Added peace-of-mind variable initialization to Strategy_FixNesting

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@560 48356398-32a2-884e-a903-53898d9a118a
2006-11-23 03:23:35 +00:00