The Zend Developer Zone has posted a new tutorial today that shows how to create a stemming analyzer for the Zend_Search_Lucene component of the Zend Framework.
The Zend implementation of Lucene provides a powerful tool set for those looking to implement a Google-like search for their PHP web application. One of the requirements in creating a Google-like search with Zend is the creation of a stemming, stop word filtering, lower-casing analyzer. This article will briefly discuss the basic role of an analyzer in the Lucene API, my implementation of a new "StandardAnalyzer" for the Zend_Search_Lucene component of the Zend Framework, the inner workings of this analyzer, and its basic usage.
The tutorial covers the creation of an analyzer: a tool that splits text into words, removes the most common ones (stop words), and standardizes what remains, for example by lowercasing it all, as the StandardAnalyzer in Java's Lucene does. The author has come up with his own implementation in PHP and works through it, explaining how it works and where to put the data and language files it needs.
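To make the analyzer's job concrete, here is a minimal sketch of the same pipeline in plain PHP: split, lowercase, drop stop words, stem. It is not the author's StandardAnalyzer and does not use the Zend_Search_Lucene API. The function names `naive_stem` and `analyze` are hypothetical, the stop word list is a tiny sample, the input is assumed to be ASCII, and the suffix stripper is a crude stand-in for a real stemming algorithm such as Porter's.

```php
<?php
// Illustrative sketch only: hypothetical helpers, not part of Zend_Search_Lucene.

/**
 * Crude suffix stripper standing in for a real stemmer (e.g. Porter's).
 * It only knows three English suffixes and will mis-stem many words.
 */
function naive_stem(string $word): string
{
    foreach (['ing', 'ed', 's'] as $suffix) {
        $stemLength = strlen($word) - strlen($suffix);
        // Strip the suffix only if at least three characters remain.
        if ($stemLength >= 3 && substr($word, -strlen($suffix)) === $suffix) {
            return substr($word, 0, $stemLength);
        }
    }
    return $word;
}

/**
 * Turn raw text into a list of normalized index terms.
 * Assumes ASCII input; $stopWords must already be lowercase.
 */
function analyze(string $text, array $stopWords): array
{
    // 1. Tokenize: split on every run of non-alphanumeric characters.
    $tokens = preg_split('/[^a-z0-9]+/i', $text, -1, PREG_SPLIT_NO_EMPTY);

    // 2. Lower-case each token.
    $tokens = array_map('strtolower', $tokens);

    // 3. Drop stop words.
    $tokens = array_filter(
        $tokens,
        fn(string $token): bool => !in_array($token, $stopWords, true)
    );

    // 4. Stem what is left and reindex the array from zero.
    return array_values(array_map('naive_stem', $tokens));
}

// Sample stop word list, chosen for this example only.
$terms = analyze('The Indexed documents are searching', ['the', 'are', 'a', 'of']);
print_r($terms);
```

Because "Indexed", "documents" and "searching" reduce to shorter root forms, a query for a different inflection of the same word can match the same index term, provided the query text is run through the same analyzer as the indexed documents.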