0
0
mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-12-22 08:21:52 +00:00

Add a nod to the RFC's recommendation that UTF-8 be used in URIs.

Mentioned in http://unspecified.wordpress.com/2008/06/29/do-browsers-encode-urls-correctly/

Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
This commit is contained in:
Edward Z. Yang 2008-10-31 12:55:07 -04:00
parent 0e6e2c4edf
commit 3fd51d527c

View File

@ -589,8 +589,10 @@ looks something like: <code>%C3%86</code>. There is no official way of
determining the character encoding of such a request, since the percent
encoding operates on a byte level, so it is usually assumed that it
is the same as the encoding the page containing the form was submitted
in. You'll run into very few problems if you only use characters in
the character encoding you chose.</p>
in. (<a href="http://tools.ietf.org/html/rfc3986#section-2.5">RFC 3986</a>
recommends that textual identifiers be translated to UTF-8; however, browser
compliance is spotty.) You'll run into very few problems
if you only use characters in the character encoding you chose.</p>
<p>However, once you start adding characters outside of your encoding
(and this is a lot more common than you may think: take curly