Project:
Patchwork-UTF8 - UTF8 Support for PHP
Jan 27, 2012 @ 17:38:40

Nicolas Grekas has shared another tool pulled out of his "Patchwork" framework and released on its own: the Patchwork-UTF8 helper. It provides counterparts to the string functions PHP already has for regular strings, but ones that handle UTF-8 correctly.

The Patchwork\Utf8 class implements the quasi complete set of string functions that need UTF-8 grapheme clusters awareness. These functions are all static methods of the Patchwork\Utf8 class. The best way to use them is to add a use Patchwork\Utf8 as u; at the beginning of your files, then when UTF-8 awareness is required, prefix by u:: when calling them.

In the README for the tool he talks about the functions included in the current release that match PHP's string functions as well as some additional methods like "isUtf8", "bestFit" and "strtocasefold". It relies on the mbstring, iconv and intl extensions being installed, and if they aren't, it falls back to other functionality (list of those methods included).
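
To see why grapheme cluster awareness matters, here is a minimal sketch that uses only PHP built-ins rather than the Patchwork-UTF8 API itself. It assumes PHP 7.0 or later (for the \u{...} escape) and a PCRE build with Unicode support; the sample string is made up for illustration.

```php
<?php
// "é" written as the letter "e" followed by U+0301 (combining acute accent).
$s = "e\u{0301}";

$bytes      = strlen($s);                    // counts bytes
$codePoints = preg_match_all('/./su', $s);   // counts UTF-8 code points
$graphemes  = preg_match_all('/\X/u', $s);   // counts grapheme clusters

// Prints: 3 bytes, 2 code points, 1 grapheme cluster(s)
printf("%d bytes, %d code points, %d grapheme cluster(s)\n", $bytes, $codePoints, $graphemes);
```

A reader sees one character, yet the byte-oriented strlen() reports three. That gap is the kind of thing a grapheme-aware helper is meant to close.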

tagged: utf8 support string patchwork framework helper mbstring iconv intl

Ahmed Shreef's Blog:
iconv misunderstands UTF-16 strings with no BOM
Aug 27, 2010 @ 18:36:56

Ahmed Shreef has a recent post on his blog about an issue he had converting UTF-16 strings to UTF-8 with the iconv functionality in PHP. Specifically, he ended up with "rubbish unreadable characters" after the conversion.

I had a problem last week with converting UTF-16 encoded strings to UTF-8 using PHP's iconv library on a Linux server. my code worked fine on my machine but the same code resulted in a rubbish unreadable characters on our production server.

In his example (a basic "Hello World" in Arabic) he notes that there's no byte order mark on the string and, because of this, iconv has to guess whether it's big-endian or little-endian. The guess varies from machine to machine, which explains the inconsistencies he saw. The solution is to define the "to" and "from" encodings for the conversion explicitly, byte order included (UTF-16LE or UTF-16BE rather than plain UTF-16), instead of letting iconv guess.
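
The fix described above can be sketched as follows. This is not the code from the post, just a minimal illustration that assumes the iconv extension is loaded and PHP 7.0 or later; the sample bytes spell an Arabic greeting picked for this example.

```php
<?php
// "مرحبا" as UTF-16 little-endian bytes with no byte order mark.
$utf16le = "\x45\x06\x31\x06\x2D\x06\x28\x06\x27\x06";

// Ambiguous: plain "UTF-16" leaves the byte order up to iconv when no BOM is present.
// $guess = iconv('UTF-16', 'UTF-8', $utf16le);

// Explicit: name the byte order in the source encoding.
$utf8 = iconv('UTF-16LE', 'UTF-8', $utf16le);
if ($utf8 === false) {
    throw new RuntimeException('UTF-16LE to UTF-8 conversion failed');
}
echo $utf8, "\n";

// The same text stored big-endian needs UTF-16BE instead.
$utf16be = "\x06\x45\x06\x31\x06\x2D\x06\x28\x06\x27";
$fromBe  = iconv('UTF-16BE', 'UTF-8', $utf16be);
```

Both conversions yield the same UTF-8 string, and neither depends on what the platform's iconv assumes for BOM-less input.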

tagged: byteordermark bom iconv utf16 utf8 convert

Yannick's Blog:
mbstring vs iconv benchmarking
Oct 06, 2008 @ 17:50:20

On his blog, Yannick has recently done some benchmarking comparing mbstring and iconv in the PHP 5.2.4 release.

Following up on my previous post about the differences between the mbstring and iconv international characters libraries (which resulted in a tentative conclusion that nobody knew anything about those differences), and particularly the comments by Nicola, we have combined forces (mostly efforts from Nicola, actually) to provide you with a little benchmarking, if that can help you decide.

His code for the test script is included (so you can gather your own results), along with a full listing of his results over up to ten executions, which lets you see the effects of possible caching. The text file he ran the script on is also available for download.
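
For readers who want to try something similar, here is a rough timing sketch. It is not the script from the post: the averageSeconds() helper, the sample text and the iteration count are all assumptions made for illustration, and either extension is skipped if it isn't loaded.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call over $iterations runs.
function averageSeconds(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $iterations;
}

// Made-up sample input; the original benchmark ran against a text file.
$text = str_repeat("Les caract\u{00E8}res accentu\u{00E9}s co\u{00FB}tent plusieurs octets. ", 200);

$candidates = [
    'mb_strlen'    => function () use ($text) { return mb_strlen($text, 'UTF-8'); },
    'iconv_strlen' => function () use ($text) { return iconv_strlen($text, 'UTF-8'); },
];

foreach ($candidates as $name => $call) {
    if (!function_exists($name)) {
        echo "$name: extension not loaded, skipped\n";
        continue;
    }
    printf("%s: %.9f s per call\n", $name, averageSeconds($call, 1000));
}
```

Timings from a loop like this vary with the machine, the PHP build and the input, so treat them as a rough comparison only.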

tagged: mbstring iconv benchmark php5 text file statistic

Dokeos Blog:
mbstring vs iconv
Apr 24, 2008 @ 16:18:08

In this post on the Dokeos blog, there's a comparison of the mbstring extension and the iconv library as it pertains to their use on multi-byte strings.

I was wondering today why use mbstring rather than iconv in Dokeos, and honestly I didn't remember exactly why I had chosen mbstring in the past, but finding information about the *differences* between the two. [...] Searching a bit more, I found a PPT presentation from Carlos Hoyos on Google.

Essentially, it boils down to how the library is integrated - mbstring is bundled and iconv is pulled from an external source. So, if you're looking for maximum portability, he recommends mbstring.
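
When portability is the concern, one option is to check which extension is available at runtime. The sketch below is only an illustration: convertToUtf8() is a hypothetical helper, not something from the post or from Dokeos, and it assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper: prefer mbstring, fall back to iconv, fail loudly otherwise.
function convertToUtf8(string $text, string $fromEncoding): string
{
    if (extension_loaded('mbstring')) {
        $converted = mb_convert_encoding($text, 'UTF-8', $fromEncoding);
        if ($converted !== false) {
            return $converted;
        }
    }
    if (extension_loaded('iconv')) {
        $converted = iconv($fromEncoding, 'UTF-8', $text);
        if ($converted !== false) {
            return $converted;
        }
    }
    throw new RuntimeException("Cannot convert from $fromEncoding to UTF-8");
}

// "café" encoded as ISO-8859-1: the é is the single byte 0xE9.
echo convertToUtf8("caf\xE9", 'ISO-8859-1'), "\n";
```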

tagged: mbstring iconv multibyte character string compare internal external

Riff Blog:
Console encoding in PHP-GTK apps
Nov 20, 2006 @ 16:58:00

PHP-GTK developers working on English-only applications don't have a problem with debugging messages written to a console, but applications on a more international front run into issues with their output. Help has arrived in this new post on the Riff Blog: a method for correctly encoding the console output of PHP-GTK applications.

PHP scripts are typically stored under UTF-8 encoding to limit i18n headaches, while the console in which their output will be displayed is normally configured to some regional encoding, like IBM850 in Windows/XP French.

So we need a workaround...

He splits the process out into a few steps, each with its own explanation and code:

  • Builtin tools
  • Buffering
  • Flushing
  • PHP-GTK is not PHP for the Web
  • Auto-flushing

It's all wrapped up with a final solution - using the iconv functionality in combination with some output buffering to correctly display the message.
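
The general idea of pairing iconv with output buffering can be sketched like this. It is not the code from the Riff Blog post: encodeForConsole() is a hypothetical callback, the IBM850 target simply follows the example encoding mentioned above, and the iconv extension and PHP 7.0 or later are assumed.

```php
<?php
// Hypothetical callback: converts buffered UTF-8 output to the console's code page.
function encodeForConsole(string $buffer): string
{
    // IBM850 is the example console encoding from the post; adjust it for your console.
    $converted = @iconv('UTF-8', 'IBM850', $buffer);

    // Fall back to the untouched text if a character has no IBM850 equivalent.
    return $converted === false ? $buffer : $converted;
}

ob_start('encodeForConsole');   // everything echoed from here on is buffered
echo "Caf\u{00E9} ouvert\n";    // written as UTF-8 by the script
ob_end_flush();                 // the callback runs and the console receives IBM850 bytes
```

Because the conversion lives in the buffer callback, the rest of the script can keep echoing UTF-8 without knowing which encoding the console expects.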

tagged: phpgtk encoding output console i18n buffering iconv

Christian Stocker's Blog:
PHP 5, OS X, fink and iconv
Jan 06, 2006 @ 13:23:47

Christian Stocker has a quick new post with a solution for those Mac users out there that would like to use the iconv extension with fink.

If you want to get the iconv extension properly running with PHP 5 and fink on OS X, you need the following configure option

--with-iconv=/sw/

and then it should work.

Hope that helps others, too.

And apparently it does, judging by the one comment on the post so far, which is a positive one.

tagged: fink iconv extension 5 OS X

