0
0
mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-11-15 01:28:42 +00:00
Commit Graph

28 Commits

Author SHA1 Message Date
Edward Z. Yang
0767bbc12d Rewrite FixNesting implementation to be tree-based.
This mega-patch rips out the FixNesting implementation and the related
ChildDef components.  The primary algorithmic change is to convert from
use of tokens to tree nodes, which are far more amenable to the style
of processing that FixNesting uses.  Additionally, FixNesting has been
changed to go bottom-up rather than top-down, in order to avoid needing
to implement backtracking.

This patch simplifies a good deal of the relevant logic, since we no
longer need to continually recalculate the nesting structure when
processing things.  However, the conversion to the alternate format
incurs some overhead, so for small inputs these changes are not a win.
One possibility to greatly reduce the constant factors here is to switch
to entirely using libxml's representation, and never serializing tokens;
this would require one to rewrite injectors, however.

The iterative post-order traversal in FixNesting is a bit subtle, but
we have essentially reified the stack and continuations.

We've removed support for %Core.EscapeInvalidChildren.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-10-20 22:37:01 -07:00
Marcus Bointon
fac747bdbd PSR-2 reformatting PHPDoc corrections
With minor corrections.

Signed-off-by: Marcus Bointon <marcus@synchromedia.co.uk>
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2013-08-17 22:27:26 -04:00
Edward Z. Yang
e41af46a8b Fix broken table content model, easily seen in XHTML1.1
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-12-26 14:49:26 +08:00
Edward Z. Yang
3570c9985a Properly handle nested sublists by folding into previous list item.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-12-26 14:00:34 +08:00
Edward Z. Yang
86ca784da3 Convert all to new configuration get/set format.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-21 03:00:34 -05:00
Edward Z. Yang
12b811d749 Add vim modelines to all files.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 04:24:59 -05:00
Edward Z. Yang
2c955af135 Remove trailing whitespace.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 02:28:20 -05:00
Edward Z. Yang
5cfecebb33 Fix bug involving whitespace-only nodes. Thanks Eric Wald for reporting.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-02 20:13:47 -05:00
Edward Z. Yang
c9b6f125aa Forms implementation for %HTML.Trusted. Some backend changes:
* Added Charsets and Character attribute types
* Fix a heavily recursive form of ContentSets, this allows a content-set
  to include another content-set which includes another content-set, and
  so forth.

Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-08-15 18:57:44 -04:00
Edward Z. Yang
a4181a80d7 Various fixes:
- Resync standalone with includes
- Fix error queue code
- Fix heisenbug with flush and certain definition error messages
- Fix premature $end in ini files
- Make $php config variable actually do something

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1659 48356398-32a2-884e-a903-53898d9a118a
2008-04-15 03:33:09 +00:00
Edward Z. Yang
3b5effcb72 - Removed tally(), swallowErrors(), assertNoErrors()
- Removed unnecessary ampersands

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1657 48356398-32a2-884e-a903-53898d9a118a
2008-04-14 03:06:36 +00:00
Edward Z. Yang
fbc595ebed Remove includes from unit tests.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1569 48356398-32a2-884e-a903-53898d9a118a
2008-02-18 04:41:42 +00:00
Edward Z. Yang
43f01925cd Convert to PHP 5 only codebase, adding visibility modifiers to all members and methods in the main library area (function only for test methods)
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1458 48356398-32a2-884e-a903-53898d9a118a
2007-11-25 02:24:39 +00:00
Edward Z. Yang
bb08f679f0 [2.1.3]
- Work around unnecessary DOMElement type-cast in PH5P that caused errors in PHP 5.1
- Work around PHP 4 SimpleTest lack-of-error complaining for one-time-only HTMLDefinition errors, this may indicate problems with error-collecting facilities in PHP 5
- Make ErrorCollectorEMock work in both PHP 4 and PHP 5
. tests/multitest.php allows you to test multiple versions by running tests/index.php through multiple interpreters using `phpv` shell script (you must provide this script!)
. Minor cosmetic change to flush-definition-cache.php: trailing newline is outputted
. Maintenance script for generating PH5P patch added, original PH5P source file also added under version control
. Full unit test runner script title made more descriptive with PHP version

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1440 48356398-32a2-884e-a903-53898d9a118a
2007-11-05 05:01:51 +00:00
Edward Z. Yang
f5371bbad4 [2.1.3]
- Buggy treatment of end tags of elements that have required attributes fixed (does not manifest on default tag-set)
- Spurious internal content reorganization error suppressed
. Error unit tests can now specify the expectation of no errors. Future iterations of the harness will be extremely strict about what errors are allowed

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1424 48356398-32a2-884e-a903-53898d9a118a
2007-10-02 01:19:46 +00:00
Edward Z. Yang
f922285383 More unit test refactoring; remove unnecessary periods from HTMLDefinition error messages
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1374 48356398-32a2-884e-a903-53898d9a118a
2007-08-07 05:38:22 +00:00
Edward Z. Yang
3af6457801 Refactor unit tests to have one logical assertion per method.
- Support executing a single unit tests using __only prefix
- Hook in Email classes to main code, even if they're unused


git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1373 48356398-32a2-884e-a903-53898d9a118a
2007-08-06 06:22:23 +00:00
Edward Z. Yang
e99520ab96 Remove trailing ?> in PHP library files, add trailing newlines to all other files.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1253 48356398-32a2-884e-a903-53898d9a118a
2007-06-27 13:58:32 +00:00
Edward Z. Yang
6f5592ae60 [2.0.1] Normalize newlines to \n for internal processing.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1235 48356398-32a2-884e-a903-53898d9a118a
2007-06-25 19:18:55 +00:00
Edward Z. Yang
e5191b3ada [2.0.1] Scrap auto_close in favor of ChildDef->elements heuristic.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1213 48356398-32a2-884e-a903-53898d9a118a
2007-06-23 20:52:57 +00:00
Edward Z. Yang
b10a380ff4 [2.0.1] Rewire test-cases to swallow errors, not expect them
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1190 48356398-32a2-884e-a903-53898d9a118a
2007-06-21 15:15:02 +00:00
Edward Z. Yang
8bbb73e47d [1.7.0] ChildDef_Custom's regex generation has been improved, removing several false positives
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1173 48356398-32a2-884e-a903-53898d9a118a
2007-06-20 15:54:50 +00:00
Edward Z. Yang
220c150e0a [1.7.0] StrictBlockquote child definition refrains from wrapping whitespace in tags now.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1159 48356398-32a2-884e-a903-53898d9a118a
2007-06-18 19:53:46 +00:00
Edward Z. Yang
c5e33416d3 [1.6.1] Unit tests now use exclusively assertIdentical
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1024 48356398-32a2-884e-a903-53898d9a118a
2007-05-05 20:17:04 +00:00
Edward Z. Yang
77d9e05a07 [1.5.0] Massive refactoring for Blockquote and Chameleon to be more extensible and accommodating of XHTMLDefinition.
- Fixed buggy chameleon-support for ins and del
. Removed context variable ParentType, replaced with IsInline, which
  is false when you're not inline and an integer of the parent that
  caused you to become inline when you are (so possibly zero)
. Removed ElementDef->type in favor of ElementDef->descendants_are_inline
  and HTMLDefinition->content_sets
. StrictBlockquote now reports what elements its supposed to allow,
  rather than what it does allow
. Removed HTMLDefinition->info_flow_elements in favor of
  HTMLDefinition->content_sets['Flow']
. Removed redundant "exclusionary" definitions from DTD roster
. StrictBlockquote now requires a construction parameter as if it
  were an Required ChildDef, this is the "real" set of allowed elements

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@710 48356398-32a2-884e-a903-53898d9a118a
2007-02-04 03:53:57 +00:00
Edward Z. Yang
108df87824 Migrate from assertError to expectError, removed all assertNoErrors()
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@669 48356398-32a2-884e-a903-53898d9a118a
2007-01-20 19:22:55 +00:00
Edward Z. Yang
b1b3377b9c [1.3.0] Huge upgrade, (X)HTML Strict now supported
+ Transparently handles inline elements in block context (blockquote)
! Added GET method to demo for easier validation, added 50kb max input size
! New directive %HTML.BlockWrapper, for block-ifying inline elements
! New directive %HTML.Parent, allows you to only allow inline content
- Added missing type to ChildDef_Chameleon
. ChildDef_Required guards against empty tags
. Lookup table HTMLDefinition->info_flow_elements added
. Added peace-of-mind variable initialization to Strategy_FixNesting

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@560 48356398-32a2-884e-a903-53898d9a118a
2006-11-23 03:23:35 +00:00
Edward Z. Yang
3b26e5dc5b [1.3.0] Refactored ChildDef classes into their own files
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@558 48356398-32a2-884e-a903-53898d9a118a
2006-11-22 18:55:15 +00:00