Basically, browsers don't parse what should be valid URIs correctly, so
we have to go through some backbends to accomodate them. Specifically,
for browseable URIs, the following URIs have unintended behavior:
- ///example.com
- http:/example.com
- http:///example.com
Furthermore, if the path begins with //, modifying these URLs must
be done with care, as if you remove the host-name component, the
parse tree changes.
I've modified the engine to follow correct URI semantics as much
as possible while outputting browser compatible code, and invalidate
the URI in cases where we can't deal. There has been a refactoring
of URIScheme so that this important check is always performed,
introducing a new member variable allow_empty_host which is true
on data, file, mailto and news schemes.
This also fixes bypass bugs on URI.Munge.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
The new logic is as follows:
* Given a URL to insert into url(), check that it is properly URL
encoded (in particular, a doublequote and backslash never occurs
within it) and then place it as url("http://example.com").
* Given a font name, if it is strictly alphanumeric, it is safe to omit
quotes. Otherwise, wrap in double quotes and replace '"' with '\22 '
(note trailing space) and '\' with '\5C ' (ditto).
We introduce expandCSSEscape() which is a hack for common parsing
idioms in CSS; this means that CSS escapes are now recognized inside
URLs as well as unquoted font names.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
- URIFilter->prepare can return false in order to abort loading of the filter
- Implemented post URI filtering. Set member variable $post to true to set a URIFilter as such.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1772 48356398-32a2-884e-a903-53898d9a118a
- Improve parseCDATA algorithm to take into account newline normalization
- Fix regression in FontFamily validator. We now have a legit parser in place, albeit somewhat limited in use. Will be superseded by parser for entire grammar
- Convert EncoderTest to new format
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1769 48356398-32a2-884e-a903-53898d9a118a
- Change API for HTMLPurifier_AttrDef_CSS_Length
- Implement HTMLPurifier_AttrDef_Switch class
- Implement HTMLPurifier_Length->compareTo, and make make() accept object instances
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1754 48356398-32a2-884e-a903-53898d9a118a
- Removed HTMLPurifier_Error
- Documentation updates
- Removed more copy() methods in favor of clone
- HTMLPurifier::getInstance() to HTMLPurifier::instance()
- Fix InterchangeBuilder to use HTMLPURIFIER_PREFIX
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1689 48356398-32a2-884e-a903-53898d9a118a
! Experimental support for some proprietary CSS attributes allowed: opacity (and all of the browser-specific equivalents) and scrollbar colors. Enable by setting %CSS.Proprietary to true.
- Colors missing # but in hex form will be corrected
- CSS Number algorithm improved
. New classes:
+ HTMLPurifier_AttrDef_CSS_AlphaValue
+ HTMLPurifier_AttrDef_CSS_Filter
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1473 48356398-32a2-884e-a903-53898d9a118a
- Add extra protection in AttrDef_URI against phantom Schemes
- Doctype moved from config to HTMLDefinition
- AttrDef_URITest mocks have more generic object parameters to deal with PHP4's copy-happy behavior
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1089 48356398-32a2-884e-a903-53898d9a118a