mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2025-01-18 11:41:52 +00:00
a5751c7f20
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@542 48356398-32a2-884e-a903-53898d9a118a
60 lines
2.6 KiB
Plaintext
Configuration Ideas

Here are some theoretical configuration ideas that we could implement some
time. Note the naming convention: %Namespace.Directive

%Attr.IDPrefix - prefix all ids with this

%Attr.RewriteFragments - if there's %Attr.IDPrefix we may want to transparently
    rewrite the URLs we parse too. However, we can only do it when it's a pure
    anchor link, so it's not foolproof

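A minimal sketch of how the two directives above could interact, written in
Python for illustration only. The function names and the id_prefix parameter
are hypothetical and not part of any existing API; a link counts as a "pure
anchor link" here only if it starts with '#'.

```python
def prefix_id(value: str, id_prefix: str) -> str:
    """Prepend the configured prefix to an id attribute value."""
    return id_prefix + value


def rewrite_fragment(uri: str, id_prefix: str) -> str:
    """Prefix the fragment of a pure anchor link; leave every other URI alone.

    A URI such as 'page.html#top' is not rewritten, because we cannot know
    whether it points back at the document being purified.
    """
    if uri.startswith("#") and len(uri) > 1:
        return "#" + id_prefix + uri[1:]
    return uri
```

With id_prefix set to 'user_', the id 'top' becomes 'user_top' and the link
'#top' becomes '#user_top', so the two still match after purification.
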
%Attr.ClassBlacklist,
%Attr.ClassWhitelist,
%Attr.ClassPolicy - determines what classes are allowed. When
    %Attr.ClassPolicy is set to Blacklist, only allow those not in
    %Attr.ClassBlacklist. When it's Whitelist, only allow those in
    %Attr.ClassWhitelist.

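The class policy described above could be sketched as follows. This is an
illustrative Python function, not an existing implementation; filter_classes
and its parameters are hypothetical names, and rejecting an unknown policy
value is an assumption.

```python
def filter_classes(class_attr, policy, blacklist=frozenset(), whitelist=frozenset()):
    """Return the class attribute with disallowed class names removed."""
    names = class_attr.split()
    if policy == "Blacklist":
        # Keep everything that is not explicitly blacklisted.
        kept = [name for name in names if name not in blacklist]
    elif policy == "Whitelist":
        # Keep only what is explicitly whitelisted.
        kept = [name for name in names if name in whitelist]
    else:
        raise ValueError("policy must be 'Blacklist' or 'Whitelist'")
    return " ".join(kept)
```
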
%Attr.MaxWidth,
%Attr.MaxHeight - caps for width and height related checks.
    (the hack in Pixels for an image crashing attack could be replaced by this)

%URI.Munge - will munge all external URIs to a different URI, which redirects
    the user to the applicable page. A urlencoded version of the URI
    will replace any instances of %s in the string. One possible
    string is 'http://www.google.com/url?q=%s'. Useful for preventing
    pagerank from being sent to other sites, but can also be used to
    redirect to a splash page notifying users that they are leaving your
    website.

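As a rough sketch of the munging step, assuming that "external" means the URI
has a host different from our own: the Python function below is illustrative
only, and munge_uri and own_host are hypothetical names.

```python
from urllib.parse import quote, urlsplit


def munge_uri(uri, template, own_host):
    """Replace %s in template with the urlencoded URI, for external URIs only."""
    host = urlsplit(uri).hostname
    if host is None or host == own_host:
        # Relative or same-host URI: nothing to munge.
        return uri
    # safe="" makes quote() encode '/' and ':' as well.
    return template.replace("%s", quote(uri, safe=""))
```

For example, with the template 'http://www.google.com/url?q=%s', the URI
'http://example.org/page?x=1' would become
'http://www.google.com/url?q=http%3A%2F%2Fexample.org%2Fpage%3Fx%3D1'.
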
%URI.AddRelNofollow - will add rel="nofollow" to all links, preventing the
    spread of ill-gotten pagerank

%URI.RelativeToAbsolute - transforms all relative URIs to absolute form

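Resolving a relative URI needs a base URI, which the note above does not
specify. The sketch below assumes the base is supplied by the caller and uses
Python's standard urljoin; to_absolute and base_uri are hypothetical names.

```python
from urllib.parse import urljoin


def to_absolute(uri, base_uri):
    """Resolve uri against base_uri; absolute URIs come back unchanged."""
    return urljoin(base_uri, uri)
```
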
%URI.HostBlacklist - strings that if found in the host of a URI are disallowed
%URI.HostBlacklistRegex - regexes that if matching the host are disallowed
%URI.HostWhitelist - domain names that are excluded from the host blacklist
%URI.HostPolicy - determines whether to reject all hosts and then whitelist,
    or allow all hosts and then apply specific blacklists, with the whitelist
    intervening. 'DenyAll' or 'AllowAll' (default)

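One possible reading of how these four directives combine, sketched in Python.
The function name and parameters are hypothetical, and two details are
assumptions: the whitelist is matched against the whole host name, and it
takes precedence under both policies.

```python
import re


def host_allowed(host, policy="AllowAll", blacklist=(), blacklist_regex=(), whitelist=()):
    """Decide whether a URI host is permitted under the given policy."""
    host = host.lower()
    if host in whitelist:
        # Whitelisted hosts bypass both the blacklists and DenyAll.
        return True
    if policy == "DenyAll":
        return False
    if policy != "AllowAll":
        raise ValueError("policy must be 'DenyAll' or 'AllowAll'")
    if any(fragment in host for fragment in blacklist):
        return False  # substring match against the host
    if any(re.search(pattern, host) for pattern in blacklist_regex):
        return False  # regex match against the host
    return True
```
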
%URI.DisableIPHosts - URIs that have IP addresses for hosts are disallowed.
    Be sure to also grab unusual encodings (dword, hex and octal), which may
    currently be mistaken for regular DNS names
%URI.DisableIDN - Disallow raw internationalized domain names. Punycode
    will still be permitted.

%URI.ConvertUnusualIPHosts - transform dword/hex/octal IP addresses to the
    regular form
%URI.ConvertAbsoluteDNS - Remove extra dots after host names that trigger
    absolute DNS. While the absolute form is actually the preferred method
    according to the RFC, most people opt to use a domain name relative
    to . (root).

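A sketch of both conversions in Python, for illustration only. It assumes the
unusual encodings follow the classic inet_aton conventions (0x prefix for hex,
leading 0 for octal, and a final component that may span the remaining bytes);
the function names are hypothetical. The same check could back
%URI.DisableIPHosts, since a non-None result means the host is an IP address.

```python
def _parse_part(part):
    """Parse one dotted component as hex (0x..), octal (0..) or decimal."""
    lower = part.lower()
    if lower.startswith("0x"):
        digits, base = lower[2:], 16
    elif len(part) > 1 and part.startswith("0"):
        digits, base = part[1:], 8
    else:
        digits, base = part, 10
    valid = "0123456789abcdef"[:base]
    if not digits or any(ch not in valid for ch in digits):
        raise ValueError("not a number: %r" % part)
    return int(digits, base)


def normalize_ip_host(host):
    """Return the dotted-decimal form of an IPv4 host, or None if not an IP."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    try:
        numbers = [_parse_part(p) for p in parts]
    except ValueError:
        return None
    *leading, last = numbers
    remaining = 4 - len(leading)  # bytes covered by the final component
    if any(n > 255 for n in leading) or last >= 256 ** remaining:
        return None
    value = 0
    for n in leading:
        value = value * 256 + n
    value = value * 256 ** remaining + last
    return ".".join(str((value >> shift) & 255) for shift in (24, 16, 8, 0))


def strip_root_dots(host):
    """Drop trailing dots that make a host name absolute."""
    return host.rstrip(".") or host
```
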
%URI.DisableExternalResources - disallow resource links (i.e. URIs that result
    in immediate requests, such as src in IMG) to external websites

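The check for an external resource could look roughly like this in Python,
assuming "external" means any host other than our own; resource_allowed and
own_host are hypothetical names.

```python
from urllib.parse import urlsplit


def resource_allowed(uri, own_host):
    """Allow resource URIs that are relative or point at our own host."""
    host = urlsplit(uri).hostname  # already lowercased by urlsplit
    return host is None or host == own_host.lower()
```
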
%HTML.DisableImg - disables all images