
SitePoint PHP Blog:
Turning a Crawled Website into a Search Engine with PHP
Jul 06, 2015 @ 15:19:43

The SitePoint PHP blog has posted the second part of their "Powerful Custom Search Engines with Diffbot" series, showing how to take the Diffbot results and make them searchable.

In the previous part of this tutorial, we used Diffbot to set up a crawljob which would eventually harvest SitePoint’s content into a data collection, fully searchable by Diffbot’s Search API. We also demonstrated those searching capabilities by applying some common filters and listing the results. [...] In this part, we’ll build a GUI simple enough for the average Joe to use it, in order to have a relatively pretty, functional, and lightweight but detailed SitePoint search engine. What’s more, we won’t be using a framework, but a mere total of three libraries to build the entire application.

If you're only interested in the end result, you can skip to the demo. Otherwise, the author walks you through the full process:

  • Bootstrapping the environment and needed libraries
  • Creating a simple "home" page with a Diffbot client
  • Creating the frontend interface (a form allowing for various search terms)
  • Writing the JavaScript to catch the form submission
  • Adding CSS to style the page
  • Building out the PHP backend to perform the different search types (author and keywords)

Finally, the author ties it all together and creates the search results output, providing links to each matching page along with its posting date, author information and a brief summary. The post ends with a look at paginating the results via a "PaginationHelper" class that adds navigation at the bottom of the results and handles moving from page to page, interfacing with the Diffbot client.

tagged: search engine diffbot tutorial series part2 results crawled website

Link: http://www.sitepoint.com/turning-crawled-website-search-engine-php/

