On Wed Sep 22 13:08:48 2010, andy.jenkinson@gmail.com wrote:
> Hi,
>
> Not being familiar with the inner workings, I'm not sure if this
> can/should be resolved in WWW::Curl, but here goes:
>
> When setting CURLOPT_FOLLOWLOCATION to 1, curl will follow redirects
> and fetch the new response appropriately. However, the filehandles
> specified in CURLOPT_WRITEHEADER (and I assume CURLOPT_WRITEDATA
> but I have not tested this) are written to multiple times - once
> per server response. This means that in a typical 301/302 redirect
> situation the header filehandle will contain two headers once
> finished, ending up looking like:
>
> HTTP/1.1 301 Moved Permanently
> Location: http://uri.of.redirect
> ... etc
>
> HTTP/1.1 200 OK
> ... etc
>
> If not addressable efficiently in WWW::Curl, I suggest listing as a
> limitation just to make others aware?
>
> Cheers,
> Andy
Hey,
I think it's libcurl's default behaviour to output the headers of every
response when a redirect is followed and header data was requested.
The reasoning, as I understand it, is that an application falls into one
of two camps. Either it is only interested in the content, in which case
it processes the body and ignores the headers, or it is interested in
the headers (for example, to build an HTTP::Response object from them).
In the latter case, one of the reasons you want to hang on to the
301/302's header data is that it might contain cookies (think of the
"POST /foo/login -> 301/302 -> GET /" case).
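To make the cookie case concrete, here is a minimal sketch in Python
(chosen only so that it runs without WWW::Curl installed). The header
dump, the cookie value and the helper name cookies_in are all made up
for illustration; the dump merely mimics what a CURLOPT_WRITEHEADER
filehandle might hold after a followed redirect.

```python
# Hypothetical header dump: both responses, separated by a blank line.
HEADER_DUMP = (
    "HTTP/1.1 302 Found\r\n"
    "Location: /\r\n"
    "Set-Cookie: session=abc123; Path=/\r\n"
    "\r\n"
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)


def cookies_in(header_text):
    """Return the values of all Set-Cookie header lines in header_text."""
    cookies = []
    for line in header_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "set-cookie":
            cookies.append(value.strip())
    return cookies


# Everything after the first blank line belongs to the final (200) response.
final_only = HEADER_DUMP.split("\r\n\r\n", 1)[1]

all_cookies = cookies_in(HEADER_DUMP)      # sees the cookie set by the 302
final_cookies = cookies_in(final_only)     # sees only the 200's headers
print(all_cookies)
print(final_cookies)
```

In this made-up dump the session cookie appears only in the redirect's
headers, so an application that kept just the final response's headers
would never see it.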
So basically the application writer has three choices to resolve this:
1. Don't bother with headers at all. Most information can be extracted
easily from getinfo[1].
2. Disable CURLOPT_FOLLOWLOCATION and handle redirects in the
application code. The URL to redirect to is available from getinfo with
the CURLINFO_REDIRECT_URL constant. One thing to watch for is the
POST->redirect->GET behaviour that's common in browsers.
3. Implement the method suggested in the CURLOPT_HEADERFUNCTION[2]
documentation, that is, delimit HTTP responses based on the HTTP status
line.
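As a rough illustration of option 3, the sketch below splits a header
dump into one block per response by starting a new block at every HTTP
status line. It is written in Python only so that it runs without
WWW::Curl; the function name split_responses is hypothetical, the dump
reuses the example from the report, and the Content-Type line is an
assumption added to give the final response a header.

```python
def split_responses(header_text):
    """Split a raw header dump into one list of lines per HTTP response.

    A new block starts at every status line, i.e. a line beginning
    with "HTTP/".
    """
    blocks = []
    for line in header_text.splitlines():
        if line.startswith("HTTP/"):
            blocks.append([line])      # status line opens a new response
        elif line and blocks:
            blocks[-1].append(line)    # header line of the current response
        # Blank separator lines, and anything before the first status
        # line, are skipped.
    return blocks


dump = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: http://uri.of.redirect\r\n"
    "\r\n"
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

responses = split_responses(dump)
final = responses[-1]  # headers of the last response in the chain
status_codes = [int(block[0].split()[1]) for block in responses]
print(status_codes)
print(final)
```

The last block holds the headers of the final response, while the
earlier blocks stay available for things like cookies set during the
redirect. Any other intermediate response that carries its own status
line would show up as a separate block in the same way.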
I will be updating the documentation in 4.14 to link to this bug report,
along with a short explanation.
[1] http://curl.haxx.se/libcurl/c/curl_easy_getinfo.html
[2] http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTHEADERFUNCTION