mirror of https://github.com/ezyang/htmlpurifier.git synced 2024-12-23 00:41:52 +00:00
2.1. The main parts of URLs

Scheme names consist of a sequence of characters. The lower case
   letters "a"--"z", digits, and the characters plus ("+"), period
   ("."), and hyphen ("-") are allowed. For resiliency, programs
   interpreting URLs should treat upper case letters as equivalent to
   lower case in scheme names (e.g., allow "HTTP" as well as "http").
Michael Gusev 2013-01-16 15:14:36 +03:00
parent 344e0640b6
commit df3a3bab6e

@@ -30,7 +30,7 @@ class HTMLPurifier_URIParser
 // Note that ["<>] are an addition to the RFC's recommended
 // characters, because they represent external delimeters.
 $r_URI = '!'.
-'(([^:/?#"<>]+):)?'. // 2. Scheme
+'(([a-zA-Z0-9\.\+\-]+):)?'. // 2. Scheme
 '(//([^/?#"<>]*))?'. // 4. Authority
 '([^?#"<>]*)'. // 5. Path
 '(\?([^#"<>]*))?'. // 7. Query
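To illustrate the effect of the change, the following is a minimal sketch that compares the old and new scheme sub-patterns from the diff above in isolation. The constant names, the `extract_scheme` helper, and the leading `^` anchor are assumptions made for this example; they are not part of HTMLPurifier_URIParser, which embeds the sub-pattern in a larger expression.

```php
<?php
// Scheme sub-patterns copied from the diff, wrapped so they can run standalone.
// The '^' anchor and the constant names are illustrative assumptions.
const OLD_SCHEME = '!^(([^:/?#"<>]+):)?!';
const NEW_SCHEME = '!^(([a-zA-Z0-9\.\+\-]+):)?!';

/**
 * Hypothetical helper: returns the scheme found at the start of $uri,
 * or null when the optional scheme group does not participate in the match.
 */
function extract_scheme(string $pattern, string $uri): ?string
{
    if (preg_match($pattern, $uri, $m) === 1 && isset($m[2]) && $m[2] !== '') {
        return $m[2];
    }
    return null;
}

// Both patterns agree on an ordinary scheme.
var_dump(extract_scheme(OLD_SCHEME, 'http://example.com/')); // string(4) "http"
var_dump(extract_scheme(NEW_SCHEME, 'http://example.com/')); // string(4) "http"

// Upper case letters are accepted, as the quoted text recommends.
var_dump(extract_scheme(NEW_SCHEME, 'HTTP://example.com/')); // string(4) "HTTP"

// The old pattern accepts any run of non-delimiter characters, spaces included.
var_dump(extract_scheme(OLD_SCHEME, 'java script:alert(1)')); // string(11) "java script"

// The new pattern only accepts letters, digits, "+", "." and "-".
var_dump(extract_scheme(NEW_SCHEME, 'java script:alert(1)')); // NULL
var_dump(extract_scheme(NEW_SCHEME, 'my_scheme:data'));       // NULL
```

With the new pattern, input that carries characters outside the allowed set before the first colon no longer yields a scheme at all. How the full parser treats the rest of such input depends on the other groups of the expression, which this sketch does not cover.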