
Matthew Turland's Blog:
Gotcha on Scraping .NET Applications with PHP and cURL
Jul 01, 2010 @ 09:51:36

In a new post on his blog, Matthew Turland describes a "gotcha" he came across when using cURL to scrape content from a remote .NET application.

I recently wrote a PHP script to scrape data from a .NET application. In the process of developing this script, I noticed something interesting that I thought I’d share. In this case, I was using the cURL extension, but the tip isn’t necessarily specific to that. One thing my script did was submit a POST request to simulate a form submission. [...] The issue I ran into had to do with a behavior of the CURLOPT_POSTFIELDS setting that’s easy to overlook.

The culprit is something cURL does automatically: when CURLOPT_POSTFIELDS is given an array, it sends the request body as multipart/form-data instead of application/x-www-form-urlencoded. Passing the array through http_build_query first produces a URL-encoded string, so the request goes out with the content type the application expects.
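As a rough illustration of the fix (a minimal sketch, not the code from Matthew's post), the snippet below encodes the form fields before handing them to cURL. The URL, the field names and the __VIEWSTATE value are hypothetical placeholders; a real .NET form would supply its own hidden fields.

```php
<?php
// Hypothetical form fields, made up for illustration only.
$fields = [
    '__VIEWSTATE' => 'abc==',
    'txtSearch'   => 'php curl',
];

// Encode the array as a URL-encoded string (key=value pairs joined by "&").
$body = http_build_query($fields, '', '&');

if (function_exists('curl_init')) {
    // Placeholder URL, assumed for this sketch.
    $ch = curl_init('http://example.com/Search.aspx');
    curl_setopt($ch, CURLOPT_POST, true);
    // Passing $fields (an array) here would make cURL send multipart/form-data.
    // Passing the encoded string keeps application/x-www-form-urlencoded.
    curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // $response = curl_exec($ch); // left commented out: the URL is a placeholder
}
```

The only change from the array version is the call to http_build_query; the rest of the cURL setup stays the same.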

tagged: net application scrape content gotcha curl