Kevin Schroeder has posted another excerpt from his book "You Want to Do WHAT with PHP?" to his blog today. This one comes from the third chapter, which looks at character encodings like UTF-8 and ISO-8859-1.
I realized that while this 3.5-year PHP consultant knew Unicode, UTF-8, character encodings such as ISO-8859-1 or ISO-8859-7, I didn't understand them as well as I thought I had. With that I threw this chapter in the book. Knowing about character encoding is what many developers have. Not as many truly understand it. In this chapter I try to de-mystify character encoding as a whole.
The excerpt introduces character encoding and what it really is - a translation layer that lets the computer handle human language. The problem comes in when multiple tools define the same letters/characters in different ways. He gives an example of a "hello world" string in normal ASCII versus the same string in EBCDIC, and shows how the EBCDIC version would be rendered by a browser that only understands ASCII.
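
To make the ASCII versus EBCDIC mismatch concrete, here is a minimal sketch in Python rather than PHP, and it is not code from the book. It assumes code page 037 (Python's built-in `cp037` codec) as a stand-in for "EBCDIC", since the excerpt summary does not say which EBCDIC variant is used. It encodes the same string both ways and then decodes the EBCDIC bytes as if they were ASCII, roughly what an ASCII-only reader would do.

```python
# Sketch: the same text encoded as ASCII and as EBCDIC.
# Assumption: code page 037 ("cp037") stands in for "EBCDIC" here;
# the excerpt does not say which EBCDIC variant it uses.
text = "hello world"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")

# Same characters, completely different byte values.
print("ASCII :", ascii_bytes.hex(" "))
print("EBCDIC:", ebcdic_bytes.hex(" "))

# An ASCII-only reader has no mapping for bytes above 0x7F, so each
# letter is replaced with U+FFFD. The EBCDIC space (0x40) happens to
# be a valid ASCII byte, but it is '@' rather than a space.
misread = ebcdic_bytes.decode("ascii", errors="replace")
print("Read as ASCII:", misread)

# Decoding with the matching code page recovers the original text.
recovered = ebcdic_bytes.decode("cp037")
print("Read as cp037:", recovered)
```

The bytes never change between the last two steps; only the table used to interpret them does, which is the point the excerpt makes about tools disagreeing on how characters are defined.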