
This queue is for tickets about the WWW-Curl CPAN distribution.

Report information
The Basics
Id: 35491
Status: resolved
Priority: 0/
Queue: WWW-Curl

People
Owner: Nobody in particular
Requestors: logan [...] dminteractive.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: WWW::Curl (4.0) and CURLOPT_HEADER
Date: Mon, 28 Apr 2008 17:14:21 -0500
To: bug-WWW-Curl [...] rt.cpan.org
From: Mike Hokenson <logan [...] dminteractive.com>
Hi,

I just started using WWW::Curl (libwww-curl-perl 4.00-1 on Debian unstable) yesterday and noticed that the response headers are printed even if CURLOPT_HEADER is set to 0. I'm not really familiar with creating Perl bindings, so the Curl.xs file doesn't make a whole lot of sense to me...

My app uses a CURLOPT_HEADERFUNCTION that logs headers to a curses panel, so it's OK here. Is this a known issue, or maybe something I've done wrong?

I attached test .c and .pl files that both perform the same operation to show the differences.

Thanks,
Mike

Message body is not shown because sender requested not to inline it.

Message body is not shown because sender requested not to inline it.

From: c.bailiff+cpan [...] devsecure.com
Hi Mike,

It's nothing you've done wrong - it's a side-effect of how the module diverts the output from STDOUT into PerlIO.

By default, libcurl would send the headers to STDOUT, but perl really needs its standard output to be fed through the PerlIO layer (which might have diversions and conversions plugged into it).

The XS code in the module sets up a default callback function for the libcurl header and body callbacks. The function just does a PerlIO output instead of a libc output on the data.

It looks like the callback function is being called by libcurl even when CURLOPT_HEADER is set false/0 - that is, libcurl seems to call the callback function if it's set, even if you asked for no output.

Your workaround is perfectly 'correct' - put in another (dummy) callback.

I'd have to think carefully about whether it's possible to fix this discrepancy in the XS code - it would mean registering special cases for changes to the HEADER and NOBODY options and adding/removing the callbacks. I suspect the proper fix would be in libcurl, though I've stared at the code for a bit and don't see the logic flaw yet.

It would also be interesting to see if the same issue exists in the PHP binding, where I'm sure they'd need to catch all curl IO to pass through PHP/Apache.
Subject: Re: [rt.cpan.org #35491] WWW::Curl (4.0) and CURLOPT_HEADER
Date: Mon, 28 Apr 2008 23:19:34 -0500
To: Cris Bailiff via RT <bug-WWW-Curl [...] rt.cpan.org>
From: Mike Hokenson <logan [...] dminteractive.com>
Hi Cris,

On Monday, April 28, 2008 at 06:56PM, Cris Bailiff via RT wrote:
> <URL: http://rt.cpan.org/Ticket/Display.html?id=35491 >
>
> It's nothing you've done wrong - it's a side-effect of how the module
> diverts the output from STDOUT into PerlIO.
>
> [snip]

Thanks for the info.

> Your workaround is perfectly 'correct' - put in another (dummy) callback.
>
> I'd have to think carefully about if it's possible to fix this
> discrepancy in the XS code - it would mean registering special cases for
> changes to the HEADER and NOBODY options and adding/removing the
> callbacks. I suspect the proper fix would be in libcurl, though I've
> stared at the code for a bit and don't see the logic flaw yet..

Now I see the bit about "Redirecting the default STDOUT target for header contents" in Curl.pm, which is probably a bit better than doing the length($_[0]) for each header. Anyone who actually reads the docs (:/) when seeing this behavior should pick up on it, and those familiar with libcurl would probably do what I did...

I wouldn't rack your brain over this too much. The workarounds are easy and most people will probably process the headers anyway.

> It would also be interesting to see if the same issue exists in the PHP
> binding, where I'm sure they'd need to catch all curl IO to pass through
> PHP/Apache.

With PHP 5.2 there's no such output, either from the command line program or to stdout/stderr/error_log under Apache 2.2.

Thanks,
Mike

Actually, it turns out this is the same bug I've been racking my brains over for a while now.

The whole thing started out as test failures for 4.00:

    t/08ssl.t (Wstat: 0 Tests: 14 Failed: 0)
      Parse errors: Tests out of sequence. Found (11) but expected (8)
                    Tests out of sequence. Found (12) but expected (9)
                    Tests out of sequence. Found (14) but expected (10)
                    Tests out of sequence. Found (16) but expected (11)
                    Tests out of sequence. Found (19) but expected (12)

I couldn't reproduce it, so I asked one of the FAIL reporters (Thanks, David!) to run the test and send me the raw output, which he was kind enough to do.

SSL debugging information, including binary garbage, is displayed by default for some combination of libcurl/WWW::Curl/openssl. The only way to silence it is to set CURLOPT_DEBUGDATA to something other than the default STDOUT.

Since this is actually causing test failures and problems for users, I'm inclined to fix this; I'm just not sure yet about the approach.

Btw, this problem and a Module::Install problem (fixed in 0.72) were responsible for 90-95% of the test failures for the 4.00 release of WWW::Curl. As soon as I come up with a solution for this, I'll be able to release 4.01, which fixes some outstanding issues.

From: szbalint [...] cpan.org
On Mon Apr 28 18:56:35 2008, CRISB wrote:
> It's nothing you've done wrong - it's a side-effect of how the module
> diverts the output from STDOUT into PerlIO.
>
> By default, libcurl would send the headers to STDOUT, but perl really
> needs it's standard output to be fed through the PerlIO layer (which
> might have diversions and conversions plugged into it).
>
> The XS code in the module sets up a default callback function for the
> libcurl header and body callbacks. The function just does a PerlIO ouput
> instead of a libc output on the data.

So far, so good.

> It looks like the callback function is being called by libcurl even when
> CURLOPT_HEADER is set false/0 - this looks like libcurl is calling the
> callback function if it's set, even if you asked for no output.

Actually, I think the case is a bit different. To fully understand what happens, let me elaborate a bit on how things are done atm with respect to the callbacks.

In 4.00, when the $curl object is created, the XS code sets up two things: it assigns the callback function pointers to all of the libcurl setopt parameters that would allow us to create callbacks, and it assigns the $curl object itself to all the options accepting filestreams.

There are two arrays of SVs internal to the XS module: one keeps track of callbacks set from the Perl side, the other keeps track of filehandles set from the Perl side. When setopt gets called from the Perl side and a function or filehandle would be set, all that happens is that the two arrays record the filehandle/coderef.

So, when libcurl actually performs the request, it calls the XS functions we've set, with the stream (that is - "self") that we've set. Our functions then check whether a function/filehandle callback exists from the Perl side, and if not, things go to STDOUT.

The problem we have is that libcurl sees that we've set CURLOPT_HEADERFUNCTION, CURLOPT_WRITEHEADER, CURLOPT_DEBUGDATA, etc. and assumes we want the output. However, this is between XS<-->libcurl. When no callbacks are set between XS<-->Perl, things end up on STDOUT through PerlIO, because that is the default catchall in the XS function. This is why we've seen header and debug output on STDOUT even if CURLOPT_HEADER is set to a false value.

Imo, the fix is to only set these callbacks (but both function and filehandle at the same time - they depend on each other!) when they are set from the Perl side, and appropriately unset them if the Perl side assigns undef.

CURLOPT_READFUNCTION and CURLOPT_WRITEFUNCTION should still be set to our custom XS function at object creation time, because this allows us to feed the default libcurl output through PerlIO.

So, in the cases that interest us, this would happen:

1. No curl option is set from the Perl side: libcurl tries to output to STDOUT, and the default WRITEFUNCTION redirects to PerlIO.
2. CURLOPT_*DATA is set, but not CURLOPT_*FUNCTION: libcurl tries to output data to the stream. The stream is set to the XS $self, libcurl checks for the specific callback, which is executed, which checks the internal array and writes to the appropriate filehandle.
3. Both *DATA and *FUNCTION are set: same as before, except that instead of a filehandle being written to, the Perl-side coderef gets executed and its return value is passed back to libcurl as required.

What do you think? If there are no objections, I'll modify the code to behave like this and release 4.01.
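
To make the difference between the 4.00 behaviour and the proposed one concrete, here is a small toy model in Python. It is only a sketch of the dispatch logic described above, not the actual XS code and not the libcurl or WWW::Curl API: the class `ToyHandle`, its channel names ("write", "header", "debug") and the `deliver` method are all hypothetical names invented for illustration, and the model assumes that "libcurl" hands data to a channel only when a callback is registered for it.

```python
import io


class ToyHandle:
    """Hypothetical stand-in for the XS-side state of one $curl object."""

    def __init__(self, register_all_at_creation, stdout):
        self.stdout = stdout      # stands in for STDOUT via PerlIO
        self.perl_side = {}       # channel -> coderef or filehandle set from "Perl"
        # Channels for which "libcurl" has been given an XS callback.
        # Body output is always routed through XS so it reaches PerlIO.
        self.registered = {"write"}
        self.lazy = not register_all_at_creation
        if register_all_at_creation:
            # Modelled 4.00 behaviour: every callback is registered up front.
            self.registered.update({"header", "debug"})

    def setopt(self, channel, target):
        """Record a coderef/filehandle; None plays the role of undef."""
        if target is None:
            self.perl_side.pop(channel, None)
            if self.lazy and channel != "write":
                # Proposed behaviour: unregister when "Perl" assigns undef.
                self.registered.discard(channel)
        else:
            self.perl_side[channel] = target
            self.registered.add(channel)

    def _xs_callback(self, channel, data):
        target = self.perl_side.get(channel)
        if target is None:
            self.stdout.write(data)   # default catchall: STDOUT
        elif callable(target):
            target(data)              # "Perl-side" coderef
        else:
            target.write(data)        # "Perl-side" filehandle

    def deliver(self, channel, data):
        """What the toy 'libcurl' does with one chunk of data."""
        if channel in self.registered:
            self._xs_callback(channel, data)
        # No callback registered: the toy library discards the data.


def run(register_all_at_creation):
    out = io.StringIO()
    handle = ToyHandle(register_all_at_creation, out)
    handle.deliver("header", "HTTP/1.1 200 OK\r\n")
    handle.deliver("write", "body\n")
    return out.getvalue()
```

In this model, `run(True)` returns the header line followed by the body, which mirrors the leak reported in this ticket, while `run(False)` returns only the body. Setting a callable or a file-like object with `setopt("header", ...)` corresponds to cases 3 and 2 above.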
There are still some problems with duphandle, which I'll fix in the coming days, but the rest seems to be working fine. I'm marking this bug as resolved.