mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2025-01-06 22:41:54 +00:00
77b60a4206
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
154 lines
7.1 KiB
HTML
154 lines
7.1 KiB
HTML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
|
|
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
|
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
|
|
<meta name="description" content="Explains how to safely allow the embedding of flash from trusted sites in HTML Purifier." />
|
|
<link rel="stylesheet" type="text/css" href="./style.css" />
|
|
|
|
<title>Embedding YouTube Videos - HTML Purifier</title>
|
|
|
|
</head><body>
|
|
|
|
<h1 class="subtitled">Embedding YouTube Videos</h1>
|
|
<div class="subtitle">...as well as other dangerous active content</div>
|
|
|
|
<div id="filing">Filed under End-User</div>
|
|
<div id="index">Return to the <a href="index.html">index</a>.</div>
|
|
<div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div>
|
|
|
|
<p>Clients like their YouTube videos. It gives them a warm fuzzy feeling when
|
|
they see a neat little embedded video player on their websites that can play
|
|
the latest clips from their documentary "Fido and the Bones of Spring".
|
|
All joking aside, the ability to embed YouTube videos or other active
|
|
content in their pages is something that a lot of people like.</p>
|
|
|
|
<p>This is a <em>bad</em> idea. The moment you embed anything untrusted,
|
|
you will definitely be slammed by a manner of nasties that can be
|
|
embedded in things from your run of the mill Flash movie to
|
|
<a href="http://blog.spywareguide.com/2006/12/myspace_phish_attack_leads_use.html">Quicktime movies</a>.
|
|
Even <code>img</code> tags, which HTML Purifier allows by default, can be
|
|
dangerous. Be distrustful of anything that tells a browser to load content
|
|
from another website automatically.</p>
|
|
|
|
<p>Luckily for us, however, whitelisting saves the day. Sure, letting users
|
|
include any old random flash file could be dangerous, but if it's
|
|
from a specific website, it probably is okay. If no amount of pleading will
|
|
convince the people upstairs that they should just settle with just linking
|
|
to their movies, you may find this technique very useful.</p>
|
|
|
|
<h2>Looking in</h2>
|
|
|
|
<p>Below is custom code that allows users to embed
|
|
YouTube videos. This is not favoritism: this trick can easily be adapted for
|
|
other forms of embeddable content.</p>
|
|
|
|
<p>Usually, websites like YouTube give us boilerplate code that you can insert
|
|
into your documents. YouTube's code goes like this:</p>
|
|
|
|
<pre>
|
|
<object width="425" height="350">
|
|
<param name="movie" value="http://www.youtube.com/v/AyPzM5WK8ys" />
|
|
<param name="wmode" value="transparent" />
|
|
<embed src="http://www.youtube.com/v/AyPzM5WK8ys"
|
|
type="application/x-shockwave-flash"
|
|
wmode="transparent" width="425" height="350" />
|
|
</object>
|
|
</pre>
|
|
|
|
<p>There are two things to note about this code:</p>
|
|
|
|
<ol>
|
|
<li><code><embed></code> is not recognized by W3C, so if you want
|
|
standards-compliant code, you'll have to get rid of it.</li>
|
|
<li>The code is exactly the same for all instances, except for the
|
|
identifier <tt>AyPzM5WK8ys</tt> which tells us which movie file
|
|
to retrieve.</li>
|
|
</ol>
|
|
|
|
<p>What point 2 means is that if we have code like <code><span
|
|
class="youtube-embed">AyPzM5WK8ys</span></code> your
|
|
application can reconstruct the full object from this small snippet that
|
|
passes through HTML Purifier <em>unharmed</em>.
|
|
<a href="http://repo.or.cz/w/htmlpurifier.git?a=blob;hb=HEAD;f=library/HTMLPurifier/Filter/YouTube.php">Show me the code!</a></p>
|
|
|
|
<p>And the corresponding usage:</p>
|
|
|
|
<pre><?php
|
|
$config->set('Filter.YouTube', true);
|
|
?></pre>
|
|
|
|
<p>There is a bit going in the two code snippets, so let's explain.</p>
|
|
|
|
<ol>
|
|
<li>This is a Filter object, which intercepts the HTML that is
|
|
coming into and out of the purifier. You can add as many
|
|
filter objects as you like. <code>preFilter()</code>
|
|
processes the code before it gets purified, and <code>postFilter()</code>
|
|
processes the code afterwards. So, we'll use <code>preFilter()</code> to
|
|
replace the object tag with a <code>span</code>, and <code>postFilter()</code>
|
|
to restore it.</li>
|
|
<li>The first preg_replace call replaces any YouTube code users may have
|
|
embedded into the benign span tag. Span is used because it is inline,
|
|
and objects are inline too. We are very careful to be extremely
|
|
restrictive on what goes inside the span tag, as if an errant code
|
|
gets in there it could get messy.</li>
|
|
<li>The HTML is then purified as usual.</li>
|
|
<li>Then, another preg_replace replaces the span tag with a fully fledged
|
|
object. Note that the embed is removed, and, in its place, a data
|
|
attribute was added to the object. This makes the tag standards
|
|
compliant! It also breaks Internet Explorer, so we add in a bit of
|
|
conditional comments with the old embed code to make it work again.
|
|
It's all quite convoluted but works.</li>
|
|
</ol>
|
|
|
|
<h2>Warning</h2>
|
|
|
|
<p>There are a number of possible problems with the code above, depending
|
|
on how you look at it.</p>
|
|
|
|
<h3>Cannot change width and height</h3>
|
|
|
|
<p>The width and height of the final YouTube movie cannot be adjusted. This
|
|
is because I am lazy. If you really insist on letting users change the size
|
|
of the movie, what you need to do is package up the attributes inside the
|
|
span tag (along with the movie ID). It gets complicated though: a malicious
|
|
user can specify an outrageously large height and width and attempt to crash
|
|
the user's operating system/browser. You need to either cap it by limiting
|
|
the amount of digits allowed in the regex or using a callback to check the
|
|
number.</p>
|
|
|
|
<h3>Trusts media's host's security</h3>
|
|
|
|
<p>By allowing this code onto our website, we are trusting that YouTube has
|
|
tech-savvy enough people not to allow their users to inject malicious
|
|
code into the Flash files. An exploit on YouTube means an exploit on your
|
|
site. Even though YouTube is run by the reputable Google, it
|
|
<a href="http://ha.ckers.org/blog/20061213/google-xss-vuln/">doesn't</a>
|
|
mean they are
|
|
<a href="http://ha.ckers.org/blog/20061208/xss-in-googles-orkut/">invulnerable.</a>
|
|
You're putting a certain measure of the job on an external provider (just as
|
|
you have by entrusting your user input to HTML Purifier), and
|
|
it is important that you are cognizant of the risk.</p>
|
|
|
|
<h3>Poorly written adaptations compromise security</h3>
|
|
|
|
<p>This should go without saying, but if you're going to adapt this code
|
|
for Google Video or the like, make sure you do it <em>right</em>. It's
|
|
extremely easy to allow a character too many in <code>postFilter()</code> and
|
|
suddenly you're introducing XSS into HTML Purifier's XSS free output. HTML
|
|
Purifier may be well written, but it cannot guard against vulnerabilities
|
|
introduced after it has finished.</p>
|
|
|
|
<h2>Help out!</h2>
|
|
|
|
<p>If you write a filter for your favorite video destination (or anything
|
|
like that, for that matter), send it over and it might get included
|
|
with the core!</p>
|
|
|
|
</body>
|
|
</html>
|
|
|
|
<!-- vim: et sw=4 sts=4
|
|
-->
|