Looking for more information on how to do PHP the right way? Check out PHP: The Right Way

Sergey Zhuk:
Fast Web Scraping With ReactPHP. Part 2: Throttling Requests
Mar 19, 2018 @ 14:20:55

Sergey Zhuk has posted the second part of his "fast web scraping" series that makes use of the ReactPHP package to perform the requests. In part one he laid some of the groundwork for the scraper and made a few requests. In this second part he improves on this basic script and how to throttle the requests so as to not overload the end server.

t is very convenient to have a single HTTP client which can be used to send as many HTTP requests as you want concurrently. But at the same time, a bad scraper which performs hundreds of concurrent requests per second can impact the performance of the site being scraped. Since the scrapers don’t drive any human traffic on the site and just affect the performance, some sites don’t like them and try to block their access. The easiest way to prevent being blocked is to crawl nicely with auto throttling the scraping speed (limiting the number of concurrent requests). The faster you scrap, the worse it is for everybody. The scraper should look like a human and perform requests accordingly. A good solution for throttling requests is a simple queue.

He shows how to integrate the clue/mq-react package into the current scraper to interface with a RabbitMQ instance and handle the reading of and writing to the queue. He includes the code needed to update the ReactPHP client. The mq-react package makes the update simple with the HTTP client reading from the queue instance rather than the array of URLs. One the queue is integrated, he then shows how to create a "parser" that can read in the HTML and extract only the wanted data using the DomCrawler component.

tagged: http reactphp client scraping web tutorial throttle request queue imdb

Link: http://sergeyzhuk.me/2018/03/19/fast-webscraping-with-reactphp-limiting-requests/

Matt Stauffer:
Login Throttling in Laravel 5.1
Aug 03, 2015 @ 13:35:57

Matt Stauffer has posted the eleventh part in his series looking at new features of the latest release of the Laravel framework (well, version 5.1). In this tutorial he shows you how to setup and configure the login throttling for your Laravel-based application with the help of the Laravel Throttle package.

Whether or not you know it, any login forms are likely to get a lot of automated login attempts. Most login forms don't stop an automated attack trying email after email, password after password, and since those aren't being logged, you might not even know it's happening.

The best solution to something like this is to halt a user from attempting logins after a certain number of failed attempts. This is called login throttling, or rate limiting. Graham Campbell wrote a great package called Laravel Throttle to address this in previous versions of Laravel, but in Laravel 5.1 Login throttling comes right out of the box.

He shows how to use the ThrottleTrait in your AuthController to have some of the "behind the scenes" work done for you. He shows you how to update your view to relay the possible error message back to the user (and includes a quick screencast of the result). He ends the post with a quick look at what the throttling functionality is doing under the covers: creating a temporary cache item based on username+IP address as a "lock" indicator. Finally, he points out two properties you can find on the auth controller to give a bit more detail on the current configuration: lockout time and max login attempts.

tagged: laravel login throttle tutorial authcontroller laravelthrottle package cache username ipaddress

Link: https://mattstauffer.co/blog/login-throttling-in-laravel-5.1

Joshua Thijssen:
Throttle your API calls: RateLimitBundle
May 29, 2014 @ 14:02:51

In his latest post Joshua Thijssen introduces a new tool he's created to help Symfony2-based APIs handle rate limiting relatively easily: the RateLimit Bundle. The project was recently created as a part of some work he's been doing on the TechAnalyze service.

Too many times third party applications will be polling your API when they don’t really need too, and maybe you can lighten the load a bit with some heavy-duty caching, but in essence you want that every API call made matters. [...] Most of our calls are pretty lightweight, but some of them aren’t, nor are they easily cacheable. This is why we are limiting the number of calls each client can make to the API. But it wouldn’t be fair to just limit the number of calls in general.

[...] Our platform is written in PHP, based on the Symfony2 framework. There are many different bundles available for symfony2, all adding new functionality, but somehow we couldn’t find a (good) bundle for throttling our API. But after a search, we found a gist by Ruud Kamphuis, which pretty much does what we need. So we decided to set up a similar bundle, and added some flexibility in its usage.

The RateLimitBundle allows you to add a "@ratelimit" annotation directly to the controller or action in the application and adds remaining allowed calls to the response headers. The mentions some drawbacks to the bundle like a dependency on redis and how it figures out "distinct calls" to the API. He also breaks it down into the functional pieces and talks about how each one works and where it fits into the overall functionality.

tagged: ratelimitbundle symfony2 api ratelimit throttle

Link: https://www.adayinthelifeof.nl/2014/05/28/throttle-your-api-calls-ratelimitbundle

PHPClasses.org Blog:
Throttling Background Tasks: Unusual Site Speedup Techniques: Part 2
Oct 26, 2010 @ 14:17:55

On the PHPClasses.org blog Manuel Lemos has posted part two of his look at techniques to help speed up your site - a few things that you maybe hadn't thought of before.

In the previous article I talked about one important factor that often seriously affects the user perception of the speed of a site, which is the presence of content from external sites that slows down the load of pages, such as advertising and widgets. In that article I presented a technique that I am using to make external content not affect the user perception of the site speed. In this article I am addressing another factor that may also affect the user perception of site speed, but this time is related to aspects of the server side environment.

In this article he looks at things like other server-side background processes, throttling their CPU usage, throttling PHP's CPU usage and the use of a monitoring class to help you and your applications (and sysadmins) stay on top of what's happening with the server.

tagged: background task throttle site speed tutorial

Link:


Trending Topics: