<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>PHPDeveloper.org</title>
    <link>http://www.phpdeveloper.org</link>
    <description>Up-to-the Minute PHP News, views and community</description>
    <language>en-us</language>
    <pubDate>Thu, 20 Jun 2013 04:39:49 -0500</pubDate>
    <ttl>30</ttl>
    <item>
      <title><![CDATA[Sameer Borate's Blog: Web scraping tutorial]]></title>
      <guid>http://www.phpdeveloper.org/news/12088</guid>
      <link>http://www.phpdeveloper.org/news/12088</link>
      <description><![CDATA[<p>
In <a href="http://www.codediesel.com/php/web-scraping-in-php-tutorial/">a new tutorial</a> on his blog today, <i>Sameer</i> shows how to use the <a href="http://simplehtmldom.sourceforge.net/">simplehtmldom</a> library to parse remote sites and pull out just the information you need (aka "web scraping").
</p>
<blockquote>
There are three ways to access a website data. One is through a browser, the other is using a API (if the site provides one) and the last by parsing the web pages through code. The last one also known as Web Scraping is a technique of extracting information from websites using specially coded programs. In this post we will take a quick look at writing a simple scraper using the <a href="http://simplehtmldom.sourceforge.net/">simplehtmldom</a> library.
</blockquote>
<p>
His three-step (really more) process guides you through installing the library, installing Firebug and writing some example code to create your first scraper - an example that pulls some of the "Featured Links" from the Google search results sidebar. The second example shows how to grab the table of contents from the most recent issue of <a href="http://wired.com">Wired</a>.
</p>]]></description>
      <pubDate>Mon, 09 Mar 2009 07:52:43 -0500</pubDate>
    </item>
  </channel>
</rss>
