mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2024-12-31 20:01:52 +00:00
0767bbc12d
This mega-patch rips out the FixNesting implementation and the related ChildDef components. The primary algorithmic change is to convert from use of tokens to tree nodes, which are far more amenable to the style of processing that FixNesting uses. Additionally, FixNesting has been changed to go bottom-up rather than top-down, in order to avoid needing to implement backtracking.

This patch simplifies a good deal of the relevant logic, since we no longer need to continually recalculate the nesting structure when processing things. However, the conversion to the alternate format incurs some overhead, so for small inputs these changes are not a win. One possibility to greatly reduce the constant factors here is to switch to entirely using libxml's representation, and never serializing tokens; this would require one to rewrite injectors, however.

The iterative post-order traversal in FixNesting is a bit subtle, but we have essentially reified the stack and continuations.

We've removed support for %Core.EscapeInvalidChildren.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
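The iterative post-order traversal mentioned in the commit message can be illustrated with a small sketch. This is not the FixNesting code itself: `DemoNode` and `demo_post_order` are hypothetical names invented here, and the only idea taken from the message is that recursion is replaced by an explicit stack whose frames record where to resume in each node.

```php
<?php

// Hypothetical minimal tree node for illustration; not an HTML Purifier class.
class DemoNode
{
    public $name;
    public $children = array();

    public function __construct($name, $children = array())
    {
        $this->name = $name;
        $this->children = $children;
    }
}

/**
 * Visits every node after all of its children (post-order) without
 * recursion. Each stack frame is a pair (node, index of next child),
 * which plays the role of the continuation of a partially processed node.
 */
function demo_post_order(DemoNode $root)
{
    $visited = array();
    $stack = array(array($root, 0));
    while (!empty($stack)) {
        list($node, $ix) = array_pop($stack);
        if ($ix < count($node->children)) {
            // Remember where to resume in this node, then descend.
            $stack[] = array($node, $ix + 1);
            $stack[] = array($node->children[$ix], 0);
        } else {
            // All children are done, so the node itself can be processed.
            $visited[] = $node->name;
        }
    }
    return $visited;
}

$tree = new DemoNode('div', array(
    new DemoNode('p', array(new DemoNode('b'))),
    new DemoNode('span'),
));
$order = demo_post_order($tree);
```

For this tree the visit order is `b`, `p`, `span`, `div`: every child is handled before its parent, which is what lets a bottom-up pass fix children without backtracking.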
50 lines
1.3 KiB
PHP
<?php

/**
 * Abstract base node class that all others inherit from.
 *
 * Why do we not use the DOM extension? (1) It is not always available,
 * (2) it has funny constraints on the data it can represent,
 * whereas we want a maximally flexible representation, and (3) its
 * interface is a bit cumbersome.
 */
abstract class HTMLPurifier_Node
{
    /**
     * Line number of the start token in the source document.
     * @type int
     */
    public $line;

    /**
     * Column number of the start token in the source document. Null if unknown.
     * @type int|null
     */
    public $col;

    /**
     * Lookup array of processing that this node is exempt from.
     * Currently, valid values are "ValidateAttributes".
     * @type array
     */
    public $armor = array();

    /**
     * When true, this node should be ignored as non-existent.
     *
     * Who is responsible for ignoring dead nodes? FixNesting is
     * responsible for removing them before passing on to child
     * validators.
     */
    public $dead = false;

    /**
     * Returns a pair of start and end tokens, where the end token
     * is null if it is not necessary. Does not include children.
     * @return array
     */
    abstract public function toTokenPair();
}

// vim: et sw=4 sts=4
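As a rough illustration of the contract described in the docblocks above, the following sketch shows how concrete subclasses might implement `toTokenPair()` and how a caller might skip dead nodes. Everything here is an assumption made for the example: `SketchNode` is a stand-in for the abstract base so the block runs on its own, `SketchElement`, `SketchText` and `sketch_live_nodes` are hypothetical names, and plain strings stand in for real token objects.

```php
<?php

// Stand-in for the abstract base class, repeated so the sketch runs alone.
abstract class SketchNode
{
    public $line;
    public $col;
    public $armor = array();
    public $dead = false;

    abstract public function toTokenPair();
}

// Hypothetical element node: needs an end token unless it is an empty tag.
class SketchElement extends SketchNode
{
    public $name;
    public $empty;

    public function __construct($name, $empty = false)
    {
        $this->name = $name;
        $this->empty = $empty;
    }

    public function toTokenPair()
    {
        // Strings stand in for token objects (an assumption of this sketch).
        if ($this->empty) {
            return array("<{$this->name} />", null);
        }
        return array("<{$this->name}>", "</{$this->name}>");
    }
}

// Hypothetical text node: a single token, so the end token is always null.
class SketchText extends SketchNode
{
    public $data;

    public function __construct($data)
    {
        $this->data = $data;
    }

    public function toTokenPair()
    {
        return array($this->data, null);
    }
}

// Drops nodes flagged as dead, as a consumer of the tree would have to.
function sketch_live_nodes(array $nodes)
{
    $live = array();
    foreach ($nodes as $node) {
        if (!$node->dead) {
            $live[] = $node;
        }
    }
    return $live;
}

$b = new SketchElement('b');
$br = new SketchElement('br', true);
$text = new SketchText('hello');
$text->dead = true;

$pairB = $b->toTokenPair();
$pairBr = $br->toTokenPair();
$live = sketch_live_nodes(array($b, $br, $text));
```

Returning a fixed two-element array keeps the caller's logic uniform: it can always emit the first entry, then the children, then the second entry if it is not null.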