* Define HTMLPurifier_Lexer::$_entity_parser property
This fixes a PHP 8.2 deprecation.
* Define HTMLPurifier_URIFilterHarness::$filter property
This fixes a PHP 8.2 deprecation.
* Define HTMLPurifier_AttrTransform_NameSync::$idDef property
This fixes a PHP 8.2 deprecation.
* Define HTMLPurifier_AttrTransform_NameSyncTest::$accumulator property
This fixes a PHP 8.2 deprecation.
* Define HTMLPurifier_AttrValidator_ErrorsTest::$language property
This fixes a PHP 8.2 deprecation.
* Define HTMLPurifier_ChildDef_List::$whitespace property
This fixes a PHP 8.2 deprecation.
* Do not modify incoming tokens in RemoveSpansWithoutAttributes
Previously the undefined property `->markForDeletion` was added to the incoming
tokens. This causes a deprecation in PHP 8.2. Fix this by storing to-be-deleted
tokens inside SplObjectStorage. In PHP 8 a WeakMap would be preferable, as that
prevents leaks if `handleEnd` is never called for the token.
This mega-patch rips out the FixNesting implementation and the related
ChildDef components. The primary algorithmic change is to convert from
use of tokens to tree nodes, which are far more amenable to the style
of processing that FixNesting uses. Additionally, FixNesting has been
changed to go bottom-up rather than top-down, in order to avoid needing
to implement backtracking.
This patch simplifies a good deal of the relevant logic, since we no
longer need to continually recalculate the nesting structure when
processing things. However, the conversion to the alternate format
incurs some overhead, so for small inputs these changes are not a win.
One possibility to greatly reduce the constant factors here is to switch
to entirely using libxml's representation, and never serializing tokens;
this would require one to rewrite injectors, however.
The iterative post-order traversal in FixNesting is a bit subtle, but
we have essentially reified the stack and continuations.
We've removed support for %Core.EscapeInvalidChildren.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>