SLOW
also known as the HELP ME LIBRARY IS TOO SLOW MY PAGE TAKE TOO LONG LOAD page

HTMLPurifier is a very powerful library. But with power comes great
responsibility, or, at least, longer execution times. Remember, this
library isn't lightly grazing over submitted HTML: it's deconstructing
the whole thing, rigorously checking the parts, and then putting it
back together.

So, if it turns out that HTMLPurifier is too slow for outbound filtering
(filtering the HTML every time it is displayed), you've got a few options:

1. Inbound filtering - perform filtering of HTML when it's submitted by the
user. Since the user is already submitting something, an extra half a
second tacked on to the load time probably isn't going to be that huge of
a problem. Then, displaying the content is a simple matter of outputting
it directly from your database/filesystem. The trouble with this method is
that your user loses the original text, and when doing edits, will be
handling the filtered text. While this may be a good thing, especially if
you're using a WYSIWYG editor, it can also result in data loss if a user
expects something they wrote to still be there and the filter has removed
it.

2. Caching the filtered output - accept the submitted text and put it
unaltered into the database, but then also generate a filtered version and
stash that in the database. Serve the filtered version to readers, and the
unaltered version to editors. If need be, you can invalidate the cache and
have the cached filtered version be regenerated on the first page view. Pros?
Full data retention. Cons? It's more complicated, and it opens other editors
up to XSS if they load the unaltered text into a WYSIWYG editor (to fix
that, they'd have to be able to get their hands on the *really* original
text served in plaintext mode).

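The caching approach in option 2 can be sketched roughly as follows. This is
a minimal, language-neutral illustration in Python, not HTMLPurifier code:
the purify() function is a hypothetical stand-in for the real (expensive)
filter call, and PostStore, Post, and their method names are invented for
this example, with a dictionary standing in for the database.

```python
import html
from dataclasses import dataclass
from typing import Callable, Dict, Optional


def purify(dirty: str) -> str:
    """Hypothetical stand-in for the expensive filter; it only escapes text."""
    return html.escape(dirty)


@dataclass
class Post:
    raw: str                        # exactly what the user submitted
    filtered: Optional[str] = None  # cached filtered copy; None means stale


class PostStore:
    """In-memory stand-in for a database table holding both versions."""

    def __init__(self, purifier: Callable[[str], str] = purify) -> None:
        self._purifier = purifier
        self._posts: Dict[int, Post] = {}
        self.filter_calls = 0  # how many times the expensive filter ran

    def _filter(self, raw: str) -> str:
        self.filter_calls += 1
        return self._purifier(raw)

    def submit(self, post_id: int, raw: str) -> None:
        # Keep the unaltered text and stash a filtered copy next to it.
        self._posts[post_id] = Post(raw=raw, filtered=self._filter(raw))

    def view(self, post_id: int) -> str:
        # Readers get the cached copy; regenerate only if it was invalidated.
        post = self._posts[post_id]
        if post.filtered is None:
            post.filtered = self._filter(post.raw)
        return post.filtered

    def edit_source(self, post_id: int) -> str:
        # Editors get the unaltered text back; it is still untrusted input.
        return self._posts[post_id].raw

    def invalidate(self, post_id: int) -> None:
        # For example after changing filter settings: rebuild on next view.
        self._posts[post_id].filtered = None
```

Repeated calls to view() reuse the stored copy, so the filter runs once per
submission plus once after each invalidation. Whatever edit_source() returns
should be shown as plain text (for instance in a textarea), not rendered as
HTML, for the XSS reason given above.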
In short, inbound filtering is almost as simple as outbound filtering, but
it has some drawbacks that cannot be fixed unless you save both the original
and the filtered versions.

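To make the trade-off concrete, here is a small sketch comparing how often
the filter runs under each approach. It is illustrative Python only:
CountingPurifier is a hypothetical stand-in that merely escapes text and
counts calls, and serve_outbound and serve_inbound are names made up for
this example.

```python
import html
from typing import List


class CountingPurifier:
    """Hypothetical stand-in for the filter: escapes text and counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    def purify(self, dirty: str) -> str:
        self.calls += 1
        return html.escape(dirty)


def serve_outbound(purifier: CountingPurifier, raw: str, views: int) -> List[str]:
    # Outbound: the raw text is stored and filtered again on every page view.
    return [purifier.purify(raw) for _ in range(views)]


def serve_inbound(purifier: CountingPurifier, raw: str, views: int) -> List[str]:
    # Inbound: filter once at submission; only the result is stored,
    # so the original text is no longer available for later edits.
    stored = purifier.purify(raw)
    return [stored for _ in range(views)]


if __name__ == "__main__":
    outbound, inbound = CountingPurifier(), CountingPurifier()
    serve_outbound(outbound, "<i>hello</i>", 100)
    serve_inbound(inbound, "<i>hello</i>", 100)
    print("outbound filter calls:", outbound.calls)
    print("inbound filter calls:", inbound.calls)
```

For 100 page views the outbound variant calls the filter 100 times and the
inbound variant calls it once, and both serve identical output. The cost of
the inbound variant is that only the filtered text survives.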
There is a third option: profile and optimize HTMLPurifier yourself. Be sure
to tell me if you decide to do that! ;-)