
Zend Developer Zone:
PHP DOM XML extension encoding processing
September 02, 2009 @ 09:48:18

On the Zend Developer Zone today, Alexander Veremyev shares some helpful hints he discovered about PHP's DOM XML extension that could come in handy when working with different character encodings.

I recently worked with PHP's DOM XML extension while working on Zend Framework's Zend_Search_Lucene HTML highlighting capabilities, and uncovered some undocumented features and issues with the extension in regards to character encoding. The information contained in this article should also apply to other libxml-based DOM implementations, as PHP's DOM extension simply wraps that library.

He shares five tips:

  • Internal document encoding is always UTF-8
  • Input data is always treated as UTF-8
  • Text nodes and CDATA are stored as UTF-8 without transformations
  • Document encoding does not affect loading behavior
  • Save/dumping operations and encoding

He describes each of the points and includes some sample code and XML to parse to help illustrate each.
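
As a rough illustration of these points (not taken from the original article), here is a minimal sketch that assumes PHP's DOM extension is enabled. The sample XML, the `item` element and the variable names are made up for the example. It loads a document declared as ISO-8859-1, reads a node value back as UTF-8, adds a UTF-8 string, and saves the result in the declared encoding.

```php
<?php
// Hypothetical input: a document declared as ISO-8859-1.
// "\xE9" is the single byte for "é" in that encoding.
$latin1Xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<root>caf\xE9</root>";

$doc = new DOMDocument();
$doc->loadXML($latin1Xml);

// The declared encoding is remembered on the document...
echo $doc->encoding, "\n";   // ISO-8859-1

// ...but node values are handed back as UTF-8 ("é" becomes the bytes C3 A9).
$text = $doc->documentElement->textContent;
echo bin2hex($text), "\n";   // 636166c3a9

// Strings passed to DOM methods are treated as UTF-8,
// so new content is supplied as UTF-8 ("ï" is C3 AF).
$item = $doc->createElement('item');
$item->appendChild($doc->createTextNode("na\xC3\xAFve"));
$doc->documentElement->appendChild($item);

// Saving converts from the internal UTF-8 back to the document encoding.
$out = $doc->saveXML();
echo bin2hex($out), "\n";
```

In the saved output, both strings should come out as single-byte ISO-8859-1 characters, which suggests that the conversion happens only at the load and save boundaries.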

tutorial dom extension character encoding




All content copyright, 2015 PHPDeveloper.org :: info@phpdeveloper.org - Powered by the Solar PHP Framework