In a new post on the SitePoint PHP Blog today, Harry Fuecks looks at why it's "living dangerously" to use PHP with UTF-8.
This follows on from (unfinished) stuff here on charsets (tending towards UTF-8), which should help explain some of this.
Quick one—knocked up a list of "dangerous" functions and functionality in PHP, in relation to the use of UTF-8, available at http://www.phpwact.org/php/i18n/utf-8. These are for a "default" PHP setup without the mbstring overloading or PHP6 (where charset problems "magically vanish" ;) ).
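To make the "dangerous" label concrete, here is a minimal sketch (not taken from Harry's list) of how PHP's byte-oriented string functions behave on a multi-byte UTF-8 string. The sample word and variable names are just for illustration.

```php
<?php
// "café" encoded as UTF-8: the é takes two bytes (0xC3 0xA9).
$word = "caf\xC3\xA9";

// strlen() counts bytes, not characters: 5 bytes for 4 characters.
$byteLength = strlen($word);

// substr() cuts on byte offsets, so it can split a multi-byte character.
// Here it keeps "caf" plus a dangling 0xC3 byte.
$cut = substr($word, 0, 4);

// preg_match() with the /u modifier fails on invalid UTF-8 input,
// which makes it a cheap well-formedness check.
$cutIsValidUtf8 = preg_match('//u', $cut) === 1;

var_dump($byteLength, $cutIsValidUtf8);
```

The truncated string is no longer well-formed UTF-8, which is the kind of silent corruption a list of "dangerous" functions is meant to flag.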
He also notes that you can't rely on the mbstring extension being installed, so he offers an alternative...
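As a rough idea of what such a fallback can look like (this is a sketch, not necessarily the alternative from Harry's post), the hypothetical helper `utf8_char_count()` below uses `mb_strlen()` when mbstring is loaded and otherwise counts characters with a PCRE pattern in UTF-8 mode. It assumes a modern PHP with PCRE's UTF-8 support compiled in.

```php
<?php
/**
 * Count the characters (not bytes) in a UTF-8 string.
 * Hypothetical helper for illustration; the name is an assumption.
 */
function utf8_char_count(string $str): int
{
    // Prefer mbstring when the extension is available.
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }

    // Fallback: with /u the dot matches one whole UTF-8 character,
    // and /s lets it match newlines too.
    $count = preg_match_all('/./us', $str);
    if ($count === false) {
        // preg_match_all() returns false when the input is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return $count;
}

echo utf8_char_count("caf\xC3\xA9"), "\n";
```

Note that the two paths treat malformed input differently: the PCRE fallback rejects it, while `mb_strlen()` still returns a count, so validating input up front is the safer choice.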