php|architect has officially released one of its latest guides, this time Matthew Turland's "Guide to Web Scraping".
Matthew talks a bit about it in his latest blog entry:
What I’m announcing in this blog post has been in the works since early 2008 when I first pitched the idea. It was rejected by several major publishers who basically said the same thing: the idea was in too small of a niche or simply wasn’t marketable. php|architect Press respectfully disagreed with them and decided to publish what is now a book written by me that you can purchase.
The book covers pulling content from remote pages, including an understanding of HTTP status codes, a look at the tools you can use (cURL, pecl_http and Zend_Http_Client among them), and how to use technologies like DOM, SimpleXML and regular expressions to match content.
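To give a rough feel for the "match content" step, here is a minimal sketch that is not taken from the book. The book works with PHP tooling; this illustration uses Python's standard library instead, and it parses a hard-coded page rather than fetching one. The sample markup and the names `SAMPLE_HTML`, `LinkCollector`, `extract_links` and `extract_prices` are all hypothetical.

```python
import re
from html.parser import HTMLParser

# Hypothetical page body. A real scraper would fetch this over HTTP and
# check the response status code (for example 200) before parsing it.
SAMPLE_HTML = """
<html><head><title>Example Books</title></head>
<body>
  <a href="/books/1">First Book</a>
  <a href="/books/2">Second Book</a>
  <p>Price: $29.99</p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(html):
    """Parser-based extraction: walk the markup and gather link targets."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def extract_prices(html):
    """Regex-based extraction: pull dollar amounts such as $29.99."""
    return re.findall(r"\$(\d+\.\d{2})", html)


if __name__ == "__main__":
    print(extract_links(SAMPLE_HTML))
    print(extract_prices(SAMPLE_HTML))
```

The two helpers show the usual trade-off: a parser follows the document structure and copes better with messy markup, while a regular expression is quicker to write for small, well-delimited patterns.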