diff --git a/NEWS b/NEWS
index 10794b77..9ac40031 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,8 @@ NEWS ( CHANGELOG and HISTORY ) HTMLPurifier
 ! HTMLPurifier object now accepts configuration arrays, no need to manually
   instantiate a configuration object
 ! Context object now accessible to outside
+! Added enduser-youtube.html, explains how to embed YouTube videos. See
+  also corresponding smoketest preserveYouTube.php.
 - printDefinition.php: added labels, added better clarification
 . HTMLPurifier_Config::create() added, takes mixed variable and converts into
   a HTMLPurifier_Config object.
diff --git a/docs/enduser-youtube.html b/docs/enduser-youtube.html
new file mode 100644
index 00000000..a237971a
--- /dev/null
+++ b/docs/enduser-youtube.html
@@ -0,0 +1,174 @@
+
+
+
+
+
+
+
+Clients like their YouTube videos. It gives them a warm fuzzy feeling when
+they see a neat little embedded video player on their websites that can play
+the latest clips from their documentary "Fido and the Bones of Spring".
+All joking aside, the ability to embed YouTube videos or other active
+content in their pages is something that a lot of people like.
+
+This is a bad idea. The moment you embed anything untrusted,
+you will definitely be slammed by all manner of nasties that can be
+embedded in things from your run-of-the-mill Flash movie to
+QuickTime movies.
+Allowing users to tell the browser to load content from other websites
+is intrinsically dangerous: there are already security risks associated with
+letting users include images from other sites!
+
+Luckily for us, however, whitelisting saves the day. Sure, letting users
+include any old random Flash file could be dangerous, but if it's
+from a specific website, it probably is okay. If no amount of pleading will
+convince the people upstairs that they should settle for linking
+to their movies, you may find this technique very useful.
+
+Below is custom code that allows users to embed
+YouTube videos. This is not favoritism: this trick can easily be adapted for
+other forms of embeddable content.
+
+Usually, websites like YouTube give you boilerplate code that you can insert
+into your documents. YouTube's code goes like this:
+
+
+<object width="425" height="350">
+  <param name="movie" value="http://www.youtube.com/v/AyPzM5WK8ys" />
+  <param name="wmode" value="transparent" />
+  <embed src="http://www.youtube.com/v/AyPzM5WK8ys"
+         type="application/x-shockwave-flash"
+         wmode="transparent" width="425" height="350" />
+</object>
+
+
There are two things to note about this code:
+
+<embed> is not recognized by W3C, so if you want
+    standards-compliant code, you'll have to get rid of it.
+The rest of the code is identical for every video except for the
+    identifier AyPzM5WK8ys, which selects the movie.
+
+What point 2 means is that if we have code like <span
+class="youtube-embed">AyPzM5WK8ys</span> your
+application can reconstruct the full object from this small snippet that
+passes through HTML Purifier unharmed.
+
+<?php
+
+class HTMLPurifierX_PreserveYouTube extends HTMLPurifier
+{
+    function purify($html, $config = null) {
+        // Step 1: swap each YouTube <object> for a harmless placeholder span.
+        // The "s" modifier lets "." match the newlines in YouTube's boilerplate;
+        // the ID class also admits "_" and "-", which YouTube IDs may contain.
+        $pre_regex = '#<object[^>]+>.+?'.
+            'http://www\.youtube\.com/v/([A-Za-z0-9_-]+).+?</object>#s';
+        $pre_replace = '<span class="youtube-embed">\1</span>';
+        $html = preg_replace($pre_regex, $pre_replace, $html);
+        // Step 2: purify as usual; the placeholder span survives.
+        $html = parent::purify($html, $config);
+        // Step 3: rebuild the full object from the whitelisted video ID.
+        $post_regex = '#<span class="youtube-embed">([A-Za-z0-9_-]+)</span>#';
+        // Note the trailing spaces: without them the <embed> attributes
+        // would run together in the output.
+        $post_replace = '<object width="425" height="350" '.
+            'data="http://www.youtube.com/v/\1">'.
+            '<param name="movie" value="http://www.youtube.com/v/\1"></param>'.
+            '<param name="wmode" value="transparent"></param>'.
+            '<!--[if IE]>'.
+            '<embed src="http://www.youtube.com/v/\1" '.
+            'type="application/x-shockwave-flash" '.
+            'wmode="transparent" width="425" height="350" />'.
+            '<![endif]-->'.
+            '</object>';
+        $html = preg_replace($post_regex, $post_replace, $html);
+        return $html;
+    }
+}
+
+$purifier = new HTMLPurifierX_PreserveYouTube();
+$html_still_with_youtube = $purifier->purify($html_with_youtube);
+
+?>
+
+
There is a bit going on here, so let's explain.
+
+The class name uses the prefix HTMLPurifierX because it's
+    userspace code. Don't use HTMLPurifier in front of your
+    class, since it might clobber another class in the library.
+To use the subclass, change new HTMLPurifier to new
+    HTMLPurifierX_PreserveYouTube. There are other ways to go about
+    doing this: if you were calling a function that wrapped HTML Purifier,
+    you could paste the PHP right there. If you wanted to be really
+    fancy, you could make a decorator for HTMLPurifier.
+
+There are a number of possible problems with the code above, depending
+on how you look at it.
+
+The width and height of the final YouTube movie cannot be adjusted. This
+is because I am lazy. If you really insist on letting users change the size
+of the movie, what you need to do is package up the attributes inside the
+span tag (along with the movie ID). It gets complicated though: a malicious
+user can specify an outrageously large height and width and attempt to crash
+the user's operating system/browser. You need to cap the values, either by
+limiting the number of digits allowed in the regex or by using a callback to
+check the number.
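
One way to do the capping described above is sketched below. It is only an illustration under stated assumptions: the `ID:WIDTHxHEIGHT` span format, the helper names `youtube_clamp_dimension`, `youtube_expand_span` and `youtube_expand_match`, and the pixel limits are all hypothetical and not part of HTML Purifier. It combines both suggestions: the regex limits each number to four digits, and a callback clamps the value.

```php
<?php
// Hypothetical helper: forces a user-supplied dimension into [$min, $max].
function youtube_clamp_dimension($value, $min, $max) {
    $value = (int) $value;
    return max($min, min($max, $value));
}

// Callback for preg_replace_callback: $m[1] is the video ID,
// $m[2] the requested width, $m[3] the requested height.
function youtube_expand_match($m) {
    $width  = youtube_clamp_dimension($m[2], 100, 800); // limits are assumptions
    $height = youtube_clamp_dimension($m[3], 100, 600);
    $url    = 'http://www.youtube.com/v/' . $m[1];
    return '<object width="' . $width . '" height="' . $height . '" ' .
           'data="' . $url . '">' .
           '<param name="movie" value="' . $url . '"></param>' .
           '<param name="wmode" value="transparent"></param>' .
           '</object>';
}

// Expands spans of the assumed form
// <span class="youtube-embed">ID:WIDTHxHEIGHT</span>.
// {1,4} rejects numbers longer than four digits outright.
function youtube_expand_span($html) {
    $regex = '#<span class="youtube-embed">([A-Za-z0-9_-]+):' .
             '([0-9]{1,4})x([0-9]{1,4})</span>#';
    return preg_replace_callback($regex, 'youtube_expand_match', $html);
}
```

A span whose numbers are too long never matches, so it is left as an inert span rather than being turned into an object.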
+
+By allowing this code onto our website, we are trusting that YouTube has
+tech-savvy enough people not to allow their users to inject malicious
+code into the Flash files. An exploit on YouTube means an exploit on your
+site, and when you start allowing shadier sites, remember that trust
+is important.
+
+This should go without saying, but if you're going to adapt this code
+for Google Video or the like, make sure you do it right. It's
+extremely easy to allow a character too many in the final section and
+suddenly you're introducing XSS into HTML Purifier's XSS-free output. HTML
+Purifier may be well written, but it cannot guard against vulnerabilities
+introduced after it has finished.
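
To make the "one character too many" warning concrete, here is a small sketch comparing a strict pattern with a careless one. Both patterns, the `expand_embed` helper, and the attack string are illustrative assumptions, not code from HTML Purifier. The sketch also assumes that a double quote inside text content survives purification unescaped.

```php
<?php
// Strict: only the characters a video ID is assumed to use.
$strict = '#<span class="youtube-embed">([A-Za-z0-9_-]+)</span>#';
// Careless: "anything but <" also admits quotes and spaces.
$loose  = '#<span class="youtube-embed">([^<]+)</span>#';

// Hypothetical helper: expands the placeholder span using the given pattern.
function expand_embed($html, $regex) {
    $replace = '<object data="http://www.youtube.com/v/\1"></object>';
    return preg_replace($regex, $replace, $html);
}

// Harmless as text inside a span, dangerous once pasted into an attribute.
$attack = '<span class="youtube-embed">x" onmouseover="alert(1)</span>';

$safe   = expand_embed($attack, $strict); // no match: returned unchanged
$unsafe = expand_embed($attack, $loose);  // the quote closes data="..." early
```

With the strict pattern the span is left alone. With the loose one, the quote ends the `data` attribute and the rest of the text becomes a new attribute on the object.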
+
+It would probably be a good idea if this code were added to the core
+library. Look out for the inclusion of this into the core as a decorator
+or the like.
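
Until then, a userspace decorator might look roughly like the sketch below. `PreserveYouTubeDecorator` and `StubPurifier` are hypothetical names, not library classes. The stub stands in for a real purifier only so the sketch runs on its own; in practice you would pass in an HTMLPurifier instance, assuming it exposes `purify($html, $config)` as in the code above.

```php
<?php
// Hypothetical decorator: wraps any object that has purify($html, $config)
// instead of extending HTMLPurifier.
class PreserveYouTubeDecorator
{
    private $purifier;

    public function __construct($purifier) {
        $this->purifier = $purifier;
    }

    public function purify($html, $config = null) {
        // Replace each YouTube object with a placeholder span.
        $pre = '#<object[^>]+>.+?' .
               'http://www\.youtube\.com/v/([A-Za-z0-9_-]+).+?</object>#s';
        $html = preg_replace($pre, '<span class="youtube-embed">\1</span>', $html);

        // Delegate the real work to the wrapped purifier.
        $html = $this->purifier->purify($html, $config);

        // Rebuild the object from the whitelisted video ID.
        $post = '#<span class="youtube-embed">([A-Za-z0-9_-]+)</span>#';
        $replace = '<object width="425" height="350" ' .
                   'data="http://www.youtube.com/v/\1">' .
                   '<param name="movie" value="http://www.youtube.com/v/\1"></param>' .
                   '</object>';
        return preg_replace($post, $replace, $html);
    }
}

// Stand-in purifier for this sketch only: keeps <p> and <span>, drops other tags.
class StubPurifier
{
    public function purify($html, $config = null) {
        return strip_tags($html, '<span><p>');
    }
}

$decorated = new PreserveYouTubeDecorator(new StubPurifier());
```

The decorator keeps the same `purify()` signature, so calling code does not need to know which kind of object it holds.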
+
+
+
\ No newline at end of file
diff --git a/smoketests/preserveYouTube.php b/smoketests/preserveYouTube.php
new file mode 100644
index 00000000..ef347b47
--- /dev/null
+++ b/smoketests/preserveYouTube.php
@@ -0,0 +1,65 @@
+';
+?>
+
+
+Click here to see the unpurified version (breaks validation).
+ + +