Sergey Zhuk:
Fast Web Scraping With ReactPHP. Part 3: Using Proxy
Jun 26, 2018 @ 12:27:43

Sergey Zhuk has posted the third part of his series covering the use of ReactPHP to scrape content from another source on the web. In this third part of the series he improves on his scripts from before (scraping from the IMDB site) to add in a proxy server.

In the previous article, we have created a scraper to parse movies data from IMDB. We have also used a simple in-memory queue to avoid sending hundreds or thousands of concurrent requests and thus to avoid being blocked. But what if you are already blocked? The site that you are scraping has already added your IP to its blacklist and you don’t know whether it is a temporal block or a permanent one.

Such issues can be resolved with a proxy server. Using proxies and rotating IP addresses can prevent you from being detected as a scraper.

He then shows how to use the clue/reactphp-buzz package to write an asynchronous HTTP request to google.com, making use of promises rather than normal synchronous request handling. He then installs the clue/reactphp-socks package to make the connection to the proxy server(s) and modifies the Buzz client to use that as a connection. After finding a proxy server to use, he updates the scraper code created previously with the new Buzz+Socks combination and shows it in action scraping data. The post finishes with a look at adding some error handling and at what to do when the proxy requires authentication before use.

tagged: web scraping tutorial series part3 reactphp buzz socks proxy server

Link: https://sergeyzhuk.me/2018/06/20/fast-webscraping-with-reactphp-proxy/

Sergey Zhuk:
Sending Email Asynchronously With ReactPHP Child Processes
May 04, 2018 @ 09:42:27

Sergey Zhuk has a new tutorial posted on his site showing you how to use child processes in ReactPHP to send emails asynchronously using Swiftmailer.

In PHP the most of libraries and native functions are blocking and thus they block an event-loop. For example, each time we make a database query with PDO, or check a file with file_exists() our asynchronous application is being blocked and waits. Things often become challenging when we want to integrate some synchronous code in an asynchronous application. This problem can be solved in two ways: rewrite a blocking code using a new non-blocking one or fork this blocking code and let it execute in a child process, while the main program continues running asynchronously.

This first approach is not always available, asynchronous PHP ecosystem is still small and not all use-cases have asynchronous implementations. So, in this article, we will cover the second approach.

He starts by creating the main HTTP server handler running locally on port 8080. He adds in an exception handler to catch potential issues and provides example code of an exception being thrown. With that structure in place he starts on the Swiftmailer integration, adding it to the exception handler and pushing the details of the exception into the message body. This is then modified to use the react/child-process package to wrap a new PHP file inside of a child process loop. The tutorial ends with an example of how to pass data between the parent and child process. In this case it's the message from the exception.
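
The last step, passing data from the parent to the child, can be sketched with PHP's built-in proc_open() alone. This is not the code from the post, which uses react/child-process. The runInChild() helper and the stand-in child script are hypothetical, and unlike the ReactPHP version this parent blocks while it waits for the child's output.

```php
<?php
declare(strict_types=1);

/**
 * Runs $childCode in a separate PHP process, writes $input to its STDIN
 * and returns whatever the child printed to STDOUT.
 */
function runInChild(string $childCode, string $input): string
{
    $command = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($childCode);
    $pipes = [];
    $process = proc_open($command, [
        0 => ['pipe', 'r'], // the child's STDIN
        1 => ['pipe', 'w'], // the child's STDOUT
    ], $pipes);

    if (!is_resource($process)) {
        throw new RuntimeException('Could not start the child process');
    }

    fwrite($pipes[0], $input);
    fclose($pipes[0]); // closing the pipe signals end of input to the child

    $output = (string) stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);

    return $output;
}

// Stand-in for the mailer script: it only reports what it would send.
$child = 'echo "would email: " . stream_get_contents(STDIN);';

echo runInChild($child, 'Something went wrong'), PHP_EOL;
```

With react/child-process the same pipes are exposed as streams on the event loop, so the parent can keep serving requests while the child works.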

tagged: send email child process reactphp tutorial exception asynchronous

Link: http://sergeyzhuk.me/2018/05/04/reactphp-child-processes/

Sergey Zhuk:
Fast Web Scraping With ReactPHP. Part 2: Throttling Requests
Mar 19, 2018 @ 09:20:55

Sergey Zhuk has posted the second part of his "fast web scraping" series that makes use of the ReactPHP package to perform the requests. In part one he laid some of the groundwork for the scraper and made a few requests. In this second part he improves on this basic script and shows how to throttle the requests so as to not overload the end server.

It is very convenient to have a single HTTP client which can be used to send as many HTTP requests as you want concurrently. But at the same time, a bad scraper which performs hundreds of concurrent requests per second can impact the performance of the site being scraped. Since the scrapers don’t drive any human traffic on the site and just affect the performance, some sites don’t like them and try to block their access. The easiest way to prevent being blocked is to crawl nicely with auto throttling the scraping speed (limiting the number of concurrent requests). The faster you scrap, the worse it is for everybody. The scraper should look like a human and perform requests accordingly. A good solution for throttling requests is a simple queue.

He shows how to integrate the clue/mq-react package, a lightweight in-memory queue, into the current scraper to limit how many requests are in flight at once. He includes the code needed to update the ReactPHP client. The mq-react package makes the update simple: the HTTP client's requests go through the queue instead of being fired for the whole array of URLs at once. Once the queue is integrated, he then shows how to create a "parser" that can read in the HTML and extract only the wanted data using the DomCrawler component.
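
The throttling idea from the excerpt above (cap the number of concurrent requests with a simple queue) can be sketched in plain PHP. The post itself relies on clue/mq-react. The ThrottledQueue class below is a hypothetical stand-in, and the "requests" are simulated by holding on to their completion callbacks instead of doing real I/O.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical in-memory queue that runs at most $limit jobs at once.
 * Each job receives a $done callback and must call it when it finishes.
 */
final class ThrottledQueue
{
    private int $limit;
    private int $running = 0;
    /** @var callable[] jobs waiting for a free slot */
    private array $pending = [];

    public function __construct(int $limit)
    {
        $this->limit = $limit;
    }

    public function push(callable $job): void
    {
        $this->pending[] = $job;
        $this->next();
    }

    private function next(): void
    {
        // Start queued jobs for as long as a slot is free.
        while ($this->running < $this->limit && $this->pending !== []) {
            $job = array_shift($this->pending);
            $this->running++;
            $job(function (): void {
                $this->running--;
                $this->next();
            });
        }
    }
}

$queue = new ThrottledQueue(2);
$inFlight = []; // "done" callbacks of requests that started but have not finished
$log = [];

foreach (['/title/1', '/title/2', '/title/3'] as $url) {
    $queue->push(function (callable $done) use ($url, &$inFlight, &$log): void {
        $log[] = "start $url";
        $inFlight[] = $done; // a real client would call $done from its response handler
    });
}

// With a limit of 2, the third request is still waiting at this point.
echo implode(', ', $log), PHP_EOL;

// Finishing one request frees a slot, so the queue starts the next one.
array_shift($inFlight)();
echo implode(', ', $log), PHP_EOL;
```

The limit is the only knob here: lowering it makes the scraper gentler on the target site at the cost of total run time.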

tagged: http reactphp client scraping web tutorial throttle request queue imdb

Link: http://sergeyzhuk.me/2018/03/19/fast-webscraping-with-reactphp-limiting-requests/

Sergey Zhuk:
Using Router With ReactPHP Http Component
Mar 13, 2018 @ 09:25:37

Sergey Zhuk has a post on his site that covers using a Router with a ReactPHP component. This router lets you more easily direct the HTTP requests coming into the application to the correct piece of functionality.

Router defines the way your application responds to a client request to a specific endpoint which is defined by URI (or path) and a specific HTTP request method (GET, POST, etc.). With ReactPHP Http component we can create an asynchronous web server. But out of the box the component doesn’t provide any routing, so you should use third-party libraries in case you want to create a web-server with a routing system.

He starts with an example of manual routing, showing the code for a basic server and adding in handlers based on the path+HTTP verb to respond with different content. He expands this basic example out to a more "real world" situation of the usual CRUD handling for "tasks". The post then shows how to change things up and use the FastRoute routing package to remove the manual route definitions from the server and define them in the router instead. It can then dispatch these to the correct location more easily. The post finishes up showing an additional feature: how to use wildcards in these URL definitions.
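
As a rough illustration of the manual approach (matching on path plus HTTP verb, with a placeholder in the path), here is a minimal sketch in plain PHP. It is not the code from the post and not FastRoute's API: the dispatch() function, the route table and the [status, body] return shape are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Minimal method+path dispatcher. Route patterns may contain {name}
 * placeholders, which are handed to the handler as an array.
 * Returns a [status code, body] pair.
 */
function dispatch(array $routes, string $method, string $path): array
{
    $pathMatched = false;

    foreach ($routes as [$routeMethod, $pattern, $handler]) {
        // Turn "/tasks/{id}" into the regex "#^/tasks/(?P<id>[^/]+)$#".
        $regex = '#^' . preg_replace('/\{(\w+)\}/', '(?P<$1>[^/]+)', $pattern) . '$#';

        if (preg_match($regex, $path, $matches) !== 1) {
            continue;
        }
        $pathMatched = true;

        if ($routeMethod === $method) {
            // Keep only the named captures (the string keys).
            $params = array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
            return $handler($params);
        }
    }

    // Known path but wrong verb gives 405, anything else 404.
    return $pathMatched ? [405, 'Method not allowed'] : [404, 'Not found'];
}

$routes = [
    ['GET',  '/tasks',      fn(array $p): array => [200, 'list tasks']],
    ['POST', '/tasks',      fn(array $p): array => [201, 'task created']],
    ['GET',  '/tasks/{id}', fn(array $p): array => [200, 'task ' . $p['id']]],
];

[$status, $body] = dispatch($routes, 'GET', '/tasks/42');
echo $status, ' ', $body, PHP_EOL;
```

A dedicated router such as FastRoute replaces the hand-written loop and regex building, which is the simplification the post works toward.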

tagged: reactphp server http router fastroute tutorial series

Link: http://sergeyzhuk.me/2018/03/13/using-router-with-reactphp-http/

Sergey Zhuk:
Working With FileSystem In ReactPHP
Feb 28, 2018 @ 10:29:16

Sergey Zhuk has posted another ReactPHP tutorial to his site, this time focusing on working with the filesystem from a ReactPHP application.

I/O operations in the filesystem are often very slow, compared with CPU calculations. In an asynchronous PHP application this means that every time we access the filesystem even with a simple fopen() call, the event loop is being blocked. All other operations cannot be executed while we are reading or writing on the disk.

[...] So, what is the solution? ReactPHP ecosystem already has a component that allows you to work asynchronously with a filesystem: reactphp/filesystem. This component provides a promise-based interface for the most commonly used operations within a filesystem.

He starts the code with a bit of setup, creating the initial event loop, the related Filesystem instance and a pointer to a "test.txt" file. He then walks through the basic filesystem operations and the code required: reading in the file contents, creating a new file and writing content back out to a file. The next section goes through the same functionality for directories. He ends the post with a look at symbolic link creation, read and delete operations.

tagged: reactphp tutorial filesystem file directory symboliclink

Link: http://sergeyzhuk.me/2018/02/27/reactphp-filesystem/

Cees-Jan Kiewiet:
Smoke testing ReactPHP applications with Cigar
Feb 27, 2018 @ 10:47:31

In a new post to his site Cees-Jan Kiewiet covers a new library he discovered - Cigar - and how to use it for smoke testing a ReactPHP application. Smoke testing (or "sanity testing") is the evaluation of the major functionality of an application rather than individual pieces of code.

Last week I came across Cigar, a smoke testing tool by Matt Brunt. Which, to me, is great stepping stone for my personal projects/sites to integration tests. In this post we not only go into Cigar, but also how to start your HTTP ReactPHP application, run cigar against it, and shut it down again. (Note that it doesn't have to be a ReactPHP application it can also be a NodeJS app, or PHP's build in webserver you use for testing.)

He then walks through the process of installing Cigar and creating the initial configuration file of endpoints to test (along with expected statuses). He then shows how to automate things further and creates a bash script that starts the ReactPHP application, runs the tests then shuts the application down. It's a simple script but can help save a few keystrokes every time the tests are run.

tagged: smoketest cigar testing reactphp bash automation library

Link: https://blog.wyrihaximus.net/2018/02/smoke-testing-reactphp-applications-with-cigar/

Sergey Zhuk:
Fast Web Scraping With ReactPHP
Feb 12, 2018 @ 10:55:42

Sergey Zhuk has a new ReactPHP-related post to his site today showing you how to use the library to scrape content from the web quickly, making use of the asynchronous abilities the package provides.

Almost every PHP developer has ever parsed some data from the Web. Often we need some data, which is available only on some website and we want to pull this data and save it somewhere. It looks like we open a browser, walk through the links and copy data that we need. But the same thing can be automated via script. In this tutorial, I will show you the way how you can increase the speed of you parser making requests asynchronously.

In his example he creates a scraper that goes to a movie's page on the IMDB website and extracts the title, description, release date and the list of genres it falls into. Instead of creating a single-threaded process that can only fetch a single page at a time, he uses ReactPHP to speed things up, giving it a list of pages to fetch concurrently. He starts by walking through the setup of the package and the creation of the browser instance. He then includes the code to make the request and crawl the contents of the result for the data. The post ends with the full code for the client and a way to add in a timeout in case the request fails.

tagged: scraping reactphp tutorial imdb movie crawl dom

Link: http://sergeyzhuk.me/2018/02/12/fast-webscraping-with-reactphp/

Cees-Jan Kiewiet:
ReactPHP with RecoilPHP: Creating for/http-middleware-psr15-adapter
Feb 09, 2018 @ 11:21:13

Cees-Jan Kiewiet is back with the latest tutorial in his series covering ReactPHP and RecoilPHP. In the previous parts he introduced some of the basic concepts and set up the first bits of code combining ReactPHP and RecoilPHP. In this latest tutorial (part three) he shows how to integrate this with a PSR-15 compliant middleware to evaluate response time.

There are more uses for coroutines than just making working with promises easier. In this post we're diving into the details on how they are used by the Friends of ReactPHP in the PSR-15 Middleware adapter for react/http.

When we started discussing how middleware for react/http should work we also look at the state of PSR-15 at the time. We decided against implementing it directly because of the fully blocking nature of PSR-15, in favour of callable. Which turned into an even better decision when return type hints where added to it to PSR-15. Now I love PSR-15, and middleware in general, which is why I created for/http-middleware-psr15-adapter to bridge the gap.

He starts with the code required to create a normal PSR-15 middleware and then recreates the same functionality in a ReactPHP middleware. The article then shows how to use the package he developed to transform the middleware "on the fly" so that it can be used both as a normal PSR-15 middleware and as a ReactPHP middleware. He ends the post with a word of caution and a bit of advice about using this method of rewriting: basically, just because you can doesn't mean you should.

tagged: reactphp recoil psr15 middleware translate onthefly package tutorial part3 series

Link: https://blog.wyrihaximus.net/2018/02/reactphp-with-recoilphp-party-three-http-middleware-psr-15-adapter/

Cees-Jan Kiewiet:
ReactPHP with RecoilPHP: Creating a Munin Node Client
Feb 07, 2018 @ 11:21:04

Cees-Jan Kiewiet has continued his series covering the use of RecoilPHP and ReactPHP with the second tutorial focusing on the creation of a Munin Node client.

In the previous post we've covered the basics of coroutines. In this post we're going to build a munin-node client specifically to fetch switch port traffic counters. During this post we not just write an munin-node client, we also deal with some domain logic. All code examples contain comments about what is going on and why. There is a lot of knowledge in those as well so be sure to read the comments.

He starts off by talking about his own use of the Munin system to consolidate and manage data from network switches. He then gets to the code, showing the installation of the required packages and some initial Promise setup. He then creates the basic skeleton of the Munin class and adds in the functionality to connect to the node, gather the details and fetch the list of open ports and values. Finally he puts it all together and includes a screencast of the resulting execution.

tagged: reactphp recoilphp tutorial series part2 munin node client

Link: https://blog.wyrihaximus.net/2018/02/reactphp-with-recoilphp-part-two-munin-node-client/

Cees-Jan Kiewiet:
ReactPHP with RecoilPHP: An introduction
Feb 05, 2018 @ 09:51:01

In a new post to his site Cees-Jan Kiewiet gives an introduction to using asynchronous processing in your PHP application with RecoilPHP and ReactPHP.

Getting your mind wrapped around async nature can be mind bending at first. But with RecoilPHP you can write code promise as if you're writing sync code.

He starts with some sample code showing the difference between normal ReactPHP and how the same kind of thing would be written using RecoilPHP. He then gets into the setup of a project that includes the RecoilPHP package and several others from React. With that base set up, he shows how to create a promise that opens a socket and listens on it for incoming messages and how to modify it to add additional coroutines. Finally he shares a few "bonus tips" and covers error handling.
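
The "write it as if it were sync code" idea rests on PHP generators: the coroutine yields an operation and is resumed with its result. The sketch below shows that mechanism in plain PHP. It is not RecoilPHP's API. The runCoroutine() driver and the lookup() operation are hypothetical, and lookup() resolves immediately where an event loop would call back later.

```php
<?php
declare(strict_types=1);

/**
 * Drives a generator as a coroutine. Every yielded value is treated as a
 * task (a callable taking a "resolve" callback); the task's result is sent
 * back in as the value of the yield expression.
 */
function runCoroutine(Generator $coroutine, callable $onDone): void
{
    $step = function () use ($coroutine, $onDone, &$step): void {
        if (!$coroutine->valid()) {
            $onDone($coroutine->getReturn());
            return;
        }
        $task = $coroutine->current();
        $task(function ($result) use ($coroutine, &$step): void {
            $coroutine->send($result); // resumes the coroutine up to its next yield
            $step();
        });
    };
    $step();
}

// Hypothetical callback-style operation with a made-up result.
function lookup(string $host): callable
{
    return function (callable $resolve) use ($host): void {
        $resolve($host . ' -> 127.0.0.1');
    };
}

$results = [];
runCoroutine(
    (function (): Generator {
        // Reads top to bottom like blocking code, yet each yield hands
        // control back to the driver until the result is available.
        $first = yield lookup('a.example');
        $second = yield lookup('b.example');
        return [$first, $second];
    })(),
    function (array $value) use (&$results): void {
        $results = $value;
    }
);

echo implode(PHP_EOL, $results), PHP_EOL;
```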

tagged: recoilphp reactphp tutorial introduction asynchronous processing promise

Link: https://blog.wyrihaximus.net/2018/02/reactphp-with-recoilphp/