SLOW
also known as the HELP ME LIBRARY IS TOO SLOW MY PAGE TAKE TOO LONG LOAD page

HTML Purifier is a very powerful library. But with power comes great
responsibility, or, at least, longer execution times. Remember, this
library isn't lightly grazing over submitted HTML: it's deconstructing
the whole thing, rigorously checking the parts, and then putting it
back together.

So, if it turns out that HTML Purifier is too slow for outbound filtering
(purifying the HTML every time it is displayed), you've got a few options:

1. Inbound filtering - perform filtering of HTML when it's submitted by the
user. Since the user is already submitting something, an extra half a
second tacked on to the load time probably isn't going to be that big of
a problem. Then, displaying the content is a simple matter of outputting
it directly from your database/filesystem. The trouble with this method is
that your user loses the original text, and when doing edits, will be
handling the filtered text. While this may be a good thing, especially if
you're using a WYSIWYG editor, it can also result in data loss if a user
makes a typo.

2. Caching the filtered output - accept the submitted text and put it
unaltered into the database, but then also generate a filtered version and
stash that in the database. Serve the filtered version to readers, and the
unaltered version to editors. If need be, you can invalidate the cache and
have the filtered version regenerated on the first page view. Pros?
Full data retention. Cons? It's more complicated, and it exposes other
people editing the content to XSS if they are using a WYSIWYG editor, which
would render the unfiltered HTML (to fix that, they'd have to be able to
get their hands on the *really* original text served in plaintext mode).

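As a rough illustration of option 2, here is a minimal sketch in Python
(not PHP, so that it runs on its own). Everything in it is an assumption
made up for the example: the posts table, the function names, and above all
purify(), which is only a stand-in that escapes its input where a real
application would call HTML Purifier.

```python
import html
import sqlite3


def purify(raw_html):
    # Hypothetical stand-in for the expensive HTML Purifier call.
    # It just escapes everything so the sketch is self-contained.
    return html.escape(raw_html)


def open_store():
    # In-memory database holding both versions of each post.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, raw TEXT NOT NULL, filtered TEXT)"
    )
    return db


def save_post(db, post_id, raw_html):
    # Keep the unaltered text and stash the filtered version next to it.
    db.execute(
        "INSERT OR REPLACE INTO posts (id, raw, filtered) VALUES (?, ?, ?)",
        (post_id, raw_html, purify(raw_html)),
    )


def invalidate(db, post_id=None):
    # Drop cached output for one post, or for all posts if no id is given.
    if post_id is None:
        db.execute("UPDATE posts SET filtered = NULL")
    else:
        db.execute("UPDATE posts SET filtered = NULL WHERE id = ?", (post_id,))


def html_for_readers(db, post_id):
    # Assumes the post exists; a real application would handle a missing row.
    raw_html, filtered = db.execute(
        "SELECT raw, filtered FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    if filtered is None:
        # Cache miss: regenerate on the first page view and store the result.
        filtered = purify(raw_html)
        db.execute(
            "UPDATE posts SET filtered = ? WHERE id = ?", (filtered, post_id)
        )
    return filtered


def text_for_editors(db, post_id):
    # Editors get the original text; display it as plain text, not as HTML.
    row = db.execute("SELECT raw FROM posts WHERE id = ?", (post_id,)).fetchone()
    return row[0]
```

Calling invalidate(db) after changing the filter settings forces each post
to be re-filtered the next time a reader asks for it. Storing only the
filtered text, with no raw column, would turn the same sketch into option 1.
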
In short, inbound filtering is almost as simple as outbound filtering, but
it has some drawbacks which cannot be fixed unless you save both the original
and the filtered versions.

There is a third option: profile and optimize HTML Purifier yourself. Be sure
to report back your results if you decide to do that! Especially if you
port HTML Purifier to C++. ;-)