From df3a3bab6eceaddd8b0f78938507890942ac4af9 Mon Sep 17 00:00:00 2001
From: Michael Gusev
Date: Wed, 16 Jan 2013 15:14:36 +0300
Subject: [PATCH] RFC 1738 2.1. The main parts of URLs

Scheme names consist of a sequence of characters. The lower case
letters "a"--"z", digits, and the characters plus ("+"), period
("."), and hyphen ("-") are allowed. For resiliency, programs
interpreting URLs should treat upper case letters as equivalent to
lower case in scheme names (e.g., allow "HTTP" as well as "http").
---
 library/HTMLPurifier/URIParser.php | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/library/HTMLPurifier/URIParser.php b/library/HTMLPurifier/URIParser.php
index 7179e4ab..a7e5dd66 100644
--- a/library/HTMLPurifier/URIParser.php
+++ b/library/HTMLPurifier/URIParser.php
@@ -30,7 +30,7 @@ class HTMLPurifier_URIParser
         // Note that ["<>] are an addition to the RFC's recommended
         // characters, because they represent external delimeters.
         $r_URI = '!'.
-            '(([^:/?#"<>]+):)?'. // 2. Scheme
+            '(([a-zA-Z0-9\.\+\-]+):)?'. // 2. Scheme
             '(//([^/?#"<>]*))?'. // 4. Authority
             '([^?#"<>]*)'.       // 5. Path
             '(\?([^#"<>]*))?'.   // 7. Query