mirror of https://github.com/ezyang/htmlpurifier.git synced 2025-03-11 17:18:44 +00:00
Edward Z. Yang 0767bbc12d Rewrite FixNesting implementation to be tree-based.
This mega-patch rips out the FixNesting implementation and the related
ChildDef components.  The primary algorithmic change is to convert from
use of tokens to tree nodes, which are far more amenable to the style
of processing that FixNesting uses.  Additionally, FixNesting has been
changed to go bottom-up rather than top-down, in order to avoid needing
to implement backtracking.

This patch simplifies a good deal of the relevant logic, since we no
longer need to continually recalculate the nesting structure when
processing things.  However, the conversion to the alternate format
incurs some overhead, so for small inputs these changes are not a win.
One possibility to greatly reduce the constant factors here is to switch
to entirely using libxml's representation, and never serializing tokens;
this would require one to rewrite injectors, however.

The iterative post-order traversal in FixNesting is a bit subtle, but
we have essentially reified the stack and continuations.
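
As an illustration of what "reifying the stack and continuations" can look
like, here is a minimal sketch of an iterative post-order (bottom-up)
traversal. It is not the code from this patch: the Node class and the
postOrder() function are hypothetical names invented for the example, and
the real FixNesting works on HTML Purifier's own node types. Each stack
frame stores a node and the index of the next child to descend into, which
stands in for the continuation a recursive call would otherwise keep.

```php
<?php
// Hypothetical minimal tree node; NOT HTML Purifier's actual node class.
final class Node
{
    public $name;
    public $children;

    public function __construct($name, array $children = array())
    {
        $this->name = $name;
        $this->children = $children;
    }
}

/**
 * Visit every node after all of its children, without recursion.
 * Each stack frame is array(node, index of next child to descend into);
 * the index is the "saved continuation" for that node.
 */
function postOrder(Node $root, callable $visit)
{
    $stack = array(array($root, 0));
    while ($stack) {
        $top = count($stack) - 1;
        list($node, $i) = $stack[$top];
        if ($i < count($node->children)) {
            // Remember where to resume in this node, then descend.
            $stack[$top][1] = $i + 1;
            $stack[] = array($node->children[$i], 0);
        } else {
            // All children are done: process the node itself.
            array_pop($stack);
            $visit($node);
        }
    }
}

$tree = new Node('div', array(
    new Node('p', array(new Node('b'), new Node('i'))),
    new Node('span'),
));

$order = array();
postOrder($tree, function (Node $n) use (&$order) {
    $order[] = $n->name;
});
echo implode(' ', $order), "\n"; // prints "b i p span div"
```

Because a node is only visited once its frame has run out of children, any
fix-up applied in the callback sees children that have already been
processed, which is the property that lets a bottom-up pass avoid
backtracking.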

We've removed support for %Core.EscapeInvalidChildren.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-10-20 22:37:01 -07:00

README
    All about HTML Purifier

HTML Purifier is an HTML filtering solution that uses a unique combination
of robust whitelists and aggressive parsing to ensure that not only are
XSS attacks thwarted, but the resulting HTML is standards-compliant.

HTML Purifier is oriented towards richly formatted documents from
untrusted sources that require CSS and a full tag-set.  This library can
be configured to accept a more restrictive set of tags, but it won't be
as efficient as more bare-bones parsers. It will, however, do the job
right, which may be more important.

Places to go:

* See INSTALL for a quick installation guide.
* See docs/ for developer-oriented documentation, code examples and
  an in-depth installation guide.
* See WYSIWYG for information on editors like TinyMCE and FCKeditor.

HTML Purifier can be found on the web at: http://htmlpurifier.org/

    vim: et sw=4 sts=4
Description
Standards-compliant HTML filter written in PHP.
http://htmlpurifier.org
Readme LGPL-2.1 9.9 MiB
Languages
PHP 94.5%
HTML 4.6%
XSLT 0.5%
CSS 0.3%
JavaScript 0.1%