On his blog today, Chris Shiflett has two posts about a problem with Google and a Cross-site Scripting attack that it's vulnerable to.
From this post:
The recent cross-site scripting (XSS) vulnerability discovered in Google perfectly illustrates why character encoding matters. This example demonstrates how to use PHP's htmlentities() with the optional third argument that indicates the character encoding.
By way of demonstration, he provides a little PHP script that makes a request in a different character encoding than Google can handle. Coupled with the small response from Google, a UTF-7 character sent to certain browsers could be interpreted and executed.
In this second post, he answers a question from the comments - "how will this effect my site?"
Rather than offer another vague answer, I decided to provide a very simple proof of concept that demonstrates how character encoding inconsistencies can bite you. Google's vulnerability has of course been fixed, but with a simple PHP script, we can reproduce the situation.
The script, though escaped, still causes a Javascript popup box to show when the page is loaded - all due to a lack of improper character encoding handling...