SitePoint PHP Blog: Character Encoding Issues with Cultural Integration
by Chris Cornutt September 10, 2008 @ 12:07:06
On the SitePoint PHP Blog, Troels Knak-Nielsen points out some "cultural integration issues" he's seen with character encoding in his PHP applications.
The gold standard solution is to convert everything to utf-8. Since utf-8 covers the entire unicode range, it is capable of representing any character that latin1 can. Unfortunately, that's a lot easier to do from the outset, than with a big, running application. And even then, there may be third party code and extensions, which assume latin1. I'd much rather continue with latin1 being the default, and only jump through hoops at the few places where I actually need full utf-8 capacity.
He came up with a (relatively) simple solution: keep the data in the latin1 encoding it already uses, but serve the pages as utf-8, embedding utf-8 inside the latin1 where needed. He gives the code for both, using output buffering and PHP's utf8 encoding functions to make it all work.
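As a rough illustration of the output buffering half of that idea (not the code from the original post), the sketch below converts everything the script echoes from latin1 to utf-8 when the buffer is flushed. The function name latin1_buffer_to_utf8 is hypothetical, and iconv() is used here as an assumption; the post itself refers to PHP's utf8 encoding functions.

```php
<?php
// Hypothetical callback: convert a latin1 (ISO-8859-1) output buffer to UTF-8.
function latin1_buffer_to_utf8(string $buffer): string
{
    $converted = iconv('ISO-8859-1', 'UTF-8', $buffer);

    // Fall back to the untouched buffer if the conversion fails.
    return $converted === false ? $buffer : $converted;
}

// Everything echoed from here on passes through the callback on flush.
ob_start('latin1_buffer_to_utf8');

// Tell the browser the page is utf-8, if headers can still be sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// The application keeps working with latin1 bytes internally.
echo "caf\xE9\n"; // "\xE9" is e-acute in latin1
```

The application code stays untouched; only the bytes leaving the script are converted.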
character encoding cultural integration utf8 latin1 tutorial
Vinu Thomas' Blog: mbstring Functions by default in PHP
by Chris Cornutt July 18, 2008 @ 07:57:16
In a new post to his blog, Vinu Thomas talks about a set of functions that can make your life easier when handling unicode strings - the mb_* functions of the mbstring extension.
When dealing with multiple languages and internalization in PHP, some of the default functions in PHP end up mangling up the unicode characters in PHP. This is evident when you have a lot of funny looking characters coming up on your web page instead of the actual characters. [...] There is an extensions called mbstring which you can install in PHP which gives you a set of functions which are unicode ( actually multibyte ) ready.
He mentions some of the replacements, like mb_send_mail instead of mail and mb_strlen instead of the usual strlen. Thankfully, there's a simple way to use these functions without rewriting a lot of code - a php.ini setting (mbstring.func_overload) that tells PHP to swap in the multibyte versions behind the scenes.
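A small sketch of the difference between the byte-oriented and multibyte functions, assuming the mbstring extension is installed and the string is UTF-8. Note that mbstring.func_overload was deprecated in PHP 7.2 and removed in PHP 8.0, so on current versions the mb_* functions have to be called explicitly, as here.

```php
<?php
// "naive" with a diaeresis on the i, written as a UTF-8 escape.
$word = "na\u{00EF}ve";

// strlen() counts bytes: the accented character takes two bytes in UTF-8.
echo strlen($word), "\n";             // 6

// mb_strlen() counts characters when told the encoding.
echo mb_strlen($word, 'UTF-8'), "\n"; // 5
```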
mbstring function utf8 unicode multibyte replace
ThinkPHP Blog: Multilingual Websites with PHP
by Chris Cornutt July 15, 2008 @ 07:55:38
On the ThinkPHP blog, Florian Eibeck has posted an overview of some key things to consider when internationalizing your application/website.
The biggest problem is that most developers lack knowledge about Internationalisation, Localisation, Character encodings, Unicode and all those terms connected with multilingualism. The following article should give you a basic understanding and show you how to avoid those funny characters.
He defines a few terms - internationalization, ASCII, unicode and the UTF-8/ISO-8859 character sets. He also covers how to accept a utf-8 string into your application, how to work with it in PHP and how to store it in a MySQL database.
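One way the "accept a utf-8 string" step might look, as a minimal sketch rather than the article's own code: reject input that is not valid UTF-8 before it goes any further. The helper name accept_utf8 is hypothetical, and the mbstring extension is assumed.

```php
<?php
// Hypothetical helper: return the input only if it is well-formed UTF-8.
function accept_utf8(string $input): ?string
{
    return mb_check_encoding($input, 'UTF-8') ? $input : null;
}

$greeting = accept_utf8("Gr\u{00FC}\u{00DF}e"); // valid UTF-8, returned as-is
$broken   = accept_utf8("\xFC");                // lone latin1 byte, rejected

var_dump($greeting !== null, $broken === null);
```

On the storage side, the connection character set has to match as well; with mysqli that is done through mysqli::set_charset(), for example with 'utf8mb4'.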
multilingual website internationalization i18n utf8 unicode
PHPWACT.org: Handling UTF-8 with PHP
by Chris Cornutt January 24, 2008 @ 07:51:00
Ed Finkler has pointed out a handy resource for those trying to cope with using the UTF-8 support included in several of PHP's functions - this page on the Web Application Component Toolkit wiki.
This page is intended as a reference for functionality PHP provides which can either help with handling UTF-8 or should be regarded as a risk when used in conjunction with UTF-8 encoded strings. Further information can be found on the Internationalization (I18N) and Character Sets / Character Encoding Issues pages.
It talks about the "dangerous" functionality PHP has (issues that the language has in current functions) when using things like the PCRE extension, the string extension, the array methods, handling variables, the XML extensions (DOM and SAX), image manipulation, and URL parsing functionality.
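Two quick examples of the kind of risk the page describes, as a sketch that assumes UTF-8 input: PCRE treats a pattern as bytes unless the /u modifier is given, and substr() cuts on byte boundaries.

```php
<?php
$char = "\u{00E9}"; // e-acute, two bytes in UTF-8

// Without /u, "." matches a single byte, so one character fails "^.$".
var_dump(preg_match('/^.$/', $char));  // int(0)

// With /u, the pattern and subject are treated as UTF-8.
var_dump(preg_match('/^.$/u', $char)); // int(1)

// substr() slices bytes and can leave half a character behind.
$half = substr($char, 0, 1);
var_dump(strlen($half));               // int(1): only the first byte survived
```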
utf8 dangerous functionality pcre xml string array image url
Nessa's Blog: Convert Database to UTF-8
by Chris Cornutt December 13, 2007 @ 10:23:00
Nessa has posted a quick way to convert a database from whatever character set it's currently on over to UTF-8 with a handy PHP script.
When you're dealing with special characters in a database, you have to make sure that the charset and collation are dumped *with* the database, so that when you move it to another server the tables and data create properly. The biggest annoyance so far is converting tables back to UTF-8, as when this is done through the MySQL shell or phpmyadmin is had to be done table-by-table.
The script logs into the database, pulls out the list of tables (which could be lengthy, depending on the database) and runs an ALTER TABLE on each one to change its character set to 'UTF8'.
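The core of such a script can be sketched as below. This is not Nessa's script: the function name build_utf8_alter_statements is hypothetical, the table list is passed in rather than read from the server, and the statements are only built, not executed. It uses MySQL's ALTER TABLE ... CONVERT TO CHARACTER SET form, which converts existing column data as well as the table default.

```php
<?php
// Hypothetical helper: build one ALTER TABLE statement per table name.
function build_utf8_alter_statements(array $tables, string $charset = 'utf8'): array
{
    // Only allow simple charset names, since the value is placed in SQL.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException('Unexpected character set name');
    }

    $statements = [];
    foreach ($tables as $table) {
        // Quote the identifier and escape any embedded backticks.
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE {$quoted} CONVERT TO CHARACTER SET {$charset}";
    }

    return $statements;
}

// Example table names are made up for illustration.
foreach (build_utf8_alter_statements(['users', 'posts']) as $sql) {
    echo $sql, ";\n";
}
```

In a real run the table names would come from the server (for example from SHOW TABLES) and each statement would be sent over the open connection.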
characterset utf8 convert alter database table
Maggie Nelson's Blog: When PHP and Oracle assume the worst about each other
by Chris Cornutt June 13, 2007 @ 10:10:00
As mentioned by Ben Ramsey today, Maggie Nelson bumped into an issue in one of her applications with character sets and the incorrect storage/retrieval of information:
Even Oracle, which usually gets storing of data right on the money has had issues with character sets. [...] Needless to say, even when you *know* you set up your database correctly for supporting UTF8, the path to debug issues may be frustrating and full of red herrings.
She describes the setup the application is using (NLS_CHARACTERSET AL32UTF8, NLS_NCHAR_CHARACTERSET AL16UTF16), but something wasn't right. The problem appeared when they tried to store Chinese characters in the database and got invalid data back on a select.
After following several different leads, they finally found the culprit - the Apache process didn't have the access it needed to a directory in the ORACLE_HOME. In the end, it came down to three easy steps to fix a very frustrating issue.
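A quick check along these lines, run under the web server's user, can rule this kind of permission problem in or out early. This is only a sketch: the helper name unreadable_directories is hypothetical, and since the post summary does not say which directory was affected, the paths to check are left to the caller.

```php
<?php
// Hypothetical helper: return the paths that are missing or not readable
// as directories by the current process.
function unreadable_directories(array $paths): array
{
    $problems = [];
    foreach ($paths as $path) {
        if (!is_dir($path) || !is_readable($path)) {
            $problems[] = $path;
        }
    }

    return $problems;
}

// ORACLE_HOME is read from the environment; it may be unset on this machine.
$oracleHome = getenv('ORACLE_HOME');
if ($oracleHome !== false) {
    foreach (unreadable_directories([$oracleHome]) as $path) {
        echo "Not readable by this process: {$path}\n";
    }
}
```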
oracle utf8 characterset nls unicode