Is it safe to assume decoded percent-encoded URIs turn into UTF-8?

RFC 3986 states that new URI schemes should encode textual data as UTF-8 first and only then percent-encode the resulting bytes. However, this does not apply to URI schemes that were defined earlier.

Is it safe to assume that every percent-encoded URI containing multibyte characters turns into a UTF-8 encoded string after being passed through urldecode()?

For example, suppose $_SERVER['REQUEST_URI'] contains the following percent-encoded value:

/b%C3%BCch/w%C3%B6rterb%C3%BCch

After I pass this string to urldecode(), I should have a multibyte string. But how do I know which encoding the string is in? In the example above it is UTF-8, but is it safe to always assume so?
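To make the concern concrete, here is a minimal sketch of what urldecode() actually returns. The second input, /b%FCch, is a made-up example (not from the request above) of how a client using ISO-8859-1 might encode the same word. The function only turns each %XX into a raw byte, so the result carries no charset information of its own.

```php
<?php
// The same path segment, percent-encoded from two different source encodings.
$utf8Encoded   = '/b%C3%BCch'; // "ü" sent as the UTF-8 bytes C3 BC
$latin1Encoded = '/b%FCch';    // hypothetical: "ü" sent as the ISO-8859-1 byte FC

// urldecode() replaces each %XX with one raw byte; it knows nothing about charsets.
$utf8Decoded   = urldecode($utf8Encoded);
$latin1Decoded = urldecode($latin1Encoded);

echo bin2hex($utf8Decoded), "\n";   // 2f62c3bc6368
echo bin2hex($latin1Decoded), "\n"; // 2f62fc6368
```

Both results are ordinary PHP byte strings; only the byte values differ, which is why the encoding has to be established some other way.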

If it is not safe to assume so, is there a way (other than mb_detect_encoding) to detect the encoding of the string? I have checked the request headers, and they do not seem to contain anything helpful.
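One possible approach, sketched here under assumptions rather than offered as a definitive answer, is to validate instead of detect: check whether the decoded bytes are well-formed UTF-8 with mb_check_encoding(), and convert from an assumed legacy encoding only when they are not. The helper name decodePathToUtf8 and the ISO-8859-1 fallback are hypothetical choices for illustration, and the mbstring extension is assumed to be available.

```php
<?php
/**
 * Hypothetical helper: decode a percent-encoded string and return it as UTF-8.
 * $fallback is an assumption about what non-UTF-8 clients send.
 */
function decodePathToUtf8(string $encoded, string $fallback = 'ISO-8859-1'): string
{
    // Turn %XX sequences into raw bytes.
    $bytes = urldecode($encoded);

    // Well-formed UTF-8 is kept as-is.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }

    // Otherwise treat the bytes as the assumed legacy encoding and convert.
    return mb_convert_encoding($bytes, 'UTF-8', $fallback);
}

echo decodePathToUtf8('/b%C3%BCch/w%C3%B6rterb%C3%BCch'), "\n"; // already UTF-8
echo decodePathToUtf8('/b%FCch'), "\n";                         // converted from the fallback
```

This does not prove that the sender meant UTF-8. Text in a legacy encoding rarely happens to form valid UTF-8 sequences, so the check is usually reliable in practice, but the fallback encoding remains a guess.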

Source: Tips4all

Ccna final exam - java, php, javascript, ios, cshap all in one

Thursday, April 19, 2012
