
PHPMaster.com:
Working with Multibyte Strings
Jul 18, 2013 @ 15:12:55

On PHPMaster.com there's a tutorial posted that helps you understand how to work with multibyte strings in PHP. Multibyte strings contain characters that take more than one byte each to encode, which is the case for most non-English text. PHP's standard string functions work byte by byte, so these strings need to be handled with the functions from the mbstring extension instead.

A written language, whether it’s English, Japanese, or whatever else, consists of a number of characters, so an essential problem when working with a language digitally is to find a way to represent each character in a digital manner. Back in the day we only needed to represent English characters, but it’s a whole different ball game today and the result is a bewildering number of character encoding schemes used to represent the characters of many different languages. How does PHP relate to and deal with these different schemes?

The author starts with a short introduction to multibyte strings: how they're represented internally, character encoding schemes and Unicode. He also covers PHP's support for them, noting that the core string functions aren't built to handle them and that there are two extensions you might use instead, iconv and mbstring. He shows how to enable the latter and introduces some of the most common functions you'll use with it (complete with some code examples).
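As a rough illustration of the difference between the core string functions and their mbstring counterparts, here is a minimal sketch. It is not taken from the tutorial; the sample word and variable names are made up, and it assumes the mbstring extension is loaded and PHP 7 or later (for the `\u{...}` escape).

```php
<?php
// Sketch only: assumes the mbstring extension is loaded.
// "\u{00E9}" is the letter e with an acute accent, two bytes in UTF-8.
$word = "h\u{00E9}llo";

// Byte-oriented core functions
$byteLength = strlen($word);        // counts bytes, not characters
$brokenCut  = substr($word, 0, 2);  // cuts the accented letter in half

// Character-aware mbstring counterparts
$charLength = mb_strlen($word, 'UTF-8');
$cleanCut   = mb_substr($word, 0, 2, 'UTF-8');
$upper      = mb_strtoupper($word, 'UTF-8');

printf("bytes=%d chars=%d cut=%s upper=%s\n", $byteLength, $charLength, $cleanCut, $upper);
```

The byte-based cut leaves an incomplete UTF-8 sequence behind, which is the kind of damage that shows up as garbage characters on a page.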

tagged: multibyte strings tutorial mbstring introduction unicode

Link: http://phpmaster.com/working-with-multibyte-strings

Johannes Schluter's Blog:
Improvements for PHP application portability in PHP.next
Jul 26, 2011 @ 13:40:46

In a new post today Johannes Schluter talks about the upcoming version of PHP and three of the things it features: no more short tags, no more magic quotes and the dropping of the enable-zend-multibyte compile option.

I was writing about PHP.next before, many things improved there meanwhile. Most notably we have a committed version number: The next PHP release will be called PHP 5.4. The topic I want to talk about today is "Improved application portability" which covers multiple small changes which aim at making it simpler for developers to write applications working on any PHP setup.

The first two will be immediately familiar to any PHP developer, but the third might be a little more elusive. This option compiled in support for parsing PHP scripts written in multibyte encodings. In PHP 5.4 the compile-time switch goes away: the support is always built in and is controlled at runtime through the zend.multibyte php.ini setting instead.

tagged: version magicquotes shorttags enable zend multibyte configure


Vinu Thomas' Blog:
mbstring Functions by default in PHP
Jul 18, 2008 @ 12:57:16

In a new post to his blog, Vinu Thomas talks about a set of functions that can make your life easier when handling Unicode strings - the mb_* functions of the mbstring extension.

When dealing with multiple languages and internalization in PHP, some of the default functions in PHP end up mangling up the unicode characters in PHP. This is evident when you have a lot of funny looking characters coming up on your web page instead of the actual characters. [...] There is an extensions called mbstring which you can install in PHP which gives you a set of functions which are unicode ( actually multibyte ) ready.

He mentions some of the replacements, like mb_send_mail instead of mail and mb_strlen instead of the usual strlen. Thankfully, there's a simple way to make use of these functions without having to replace a lot of code - a setting in your php.ini (mbstring.func_overload) that tells PHP to swap in the mbstring versions behind the scenes.
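Note that mbstring.func_overload was deprecated in PHP 7.2 and removed in PHP 8.0, so on current versions the mb_* functions have to be called explicitly. A minimal sketch of what that looks like, assuming mbstring is loaded (the sample string and variable names are only for illustration):

```php
<?php
// Sketch only: assumes the mbstring extension is loaded.
// Set the default encoding once so later mb_* calls can omit it.
mb_internal_encoding('UTF-8');

// "naive cafe" with a diaeresis on the i and an acute accent on the e
$title = "na\u{00EF}ve caf\u{00E9}";

$byteLength = strlen($title);          // length in bytes
$charLength = mb_strlen($title);       // length in characters

$bytePos = strpos($title, 'caf');      // offset counted in bytes
$charPos = mb_strpos($title, 'caf');   // offset counted in characters

printf("length: %d bytes, %d chars; 'caf' at byte %d, char %d\n",
    $byteLength, $charLength, $bytePos, $charPos);
```

The two offsets differ because the multibyte letter before "caf" counts as two bytes but one character.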

tagged: mbstring function utf8 unicode multibyte replace


Mark Kimsal's Blog:
Addslashes(): don't call it a comeback
Jun 12, 2008 @ 18:36:20

As Michael Kimsal points out, there's a new posting on his brother Mark's blog talking about alternatives to addslashes() in your applications.

I've seen a lot of people talking about mysql_real_escape_string() vs addslashes() vs addcslashes(). There seems to be a lot of real confusion about what these functions do (even with the php.net manual around), especially when it comes to character sets. [...] So, I've decided to lay it all out in a few charts so there is no confusion about what each function does and how each can help protect against SQL injection attacks.

He tests each function against criteria such as "escapes with single quotes instead of backslash" and "prevents multi-byte attacks". He compares the speed and testability of the functions and provides a multi-byte breakdown of how the mysql_real_escape_string function works with different character sets.
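To show why character sets matter here, the sketch below reproduces a commonly cited illustration of a multi-byte problem with addslashes(). It is not taken from Mark's charts; the input bytes are chosen for the example and assume the database connection interprets data as GBK.

```php
<?php
// Sketch only: byte 0xBF followed by a single quote (0x27).
$input = "\xBF\x27";

// addslashes() works on bytes: it inserts a backslash (0x5C) before the quote.
$escaped = addslashes($input);

// The result is the byte sequence BF 5C 27. Read as GBK, BF 5C forms one
// two-byte character, so the trailing quote is no longer escaped.
$hex = bin2hex($escaped);
echo $hex, "\n";
```

An escaping function that knows the connection's character set (or a parameterized query) avoids the problem because it never treats the second byte of a character as a standalone backslash.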

tagged: addslashes compare escape string mysql addcslashes multibyte


Dokeos Blog:
mbstring vs iconv
Apr 24, 2008 @ 16:18:08

In this post on the Dokeos blog, there's a comparison of the mbstring extension and the iconv library as they pertain to working with multi-byte strings.

I was wondering today why use mbstring rather than iconv in Dokeos, and honestly I didn't remember exactly why I had chosen mbstring in the past, but finding information about the *differences* between the two. [...] Searching a bit more, I found a PPT presentation from Carlos Hoyos on Google.

Essentially, it boils down to how the library is integrated - mbstring is bundled and iconv is pulled from an external source. So, if you're looking for maximum portability, he recommends mbstring.
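If an application has to run on setups where either extension might be missing, one option is to check at runtime and use whichever is available. A minimal sketch under that assumption; the helper name `to_utf8` is hypothetical and not from the Dokeos post:

```php
<?php
// Sketch only: to_utf8() is a hypothetical helper, not part of PHP or Dokeos.
// It prefers mbstring and falls back to iconv when mbstring is not loaded.
function to_utf8(string $text, string $fromEncoding): string
{
    if (extension_loaded('mbstring')) {
        return mb_convert_encoding($text, 'UTF-8', $fromEncoding);
    }

    if (extension_loaded('iconv')) {
        $converted = iconv($fromEncoding, 'UTF-8', $text);
        if ($converted !== false) {
            return $converted;
        }
    }

    throw new RuntimeException('No usable conversion extension (mbstring or iconv).');
}

// 0xE9 is the letter e with an acute accent in ISO-8859-1.
$utf8 = to_utf8("\xE9", 'ISO-8859-1');
echo bin2hex($utf8), "\n";
```

Both branches take the same source and target encodings, but note the different argument order: mb_convert_encoding() takes the string first, iconv() takes it last.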

tagged: mbstring iconv multibyte character string compare internal external


Elizabeth Smith's Blog:
String Class (Kal_String)
Feb 01, 2006 @ 12:48:47

On her blog today, Elizabeth Smith has this new post highlighting a string class that she's created to overload the basic PHP types to handle multibyte or translated strings.

So my rather cumbersome three classes to handle translation and charsets is now ONE class. When the rest of the magic __toString stuff goes into php (estimated for 5.2, which I wouldn't know if I didn't read internals religiously) it makes it even easier to use.

Kal_String is the class itself. Basically it has TWO constructors - because there are a series of static settings and two static methods that deal with things like a default charset to use for all strings and a default language to look for. The language searching is set up with a callback - so you can write your own class using gettext or including straight php files or whatever you want. You can even manually load in translation strings for individual string instances if you're so inclined.

She gives examples of how to use the class, everything from just a simple output to the use of some of the more advanced "interpretation"-based features.

tagged: string class simple output interpret multibyte overload


