This queue is for tickets about the Net-Analysis CPAN distribution.

Report information
The Basics
Id: 62373
Status: open
Priority: 0/
Queue: Net-Analysis

People
Owner: Nobody in particular
Requestors: marios [...] cs.ucr.edu
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Problem decoding http contenet
Date: Fri, 22 Oct 2010 04:28:59 -0700 (PDT)
To: bug-Net-Analysis [...] rt.cpan.org
From: marios [...] cs.ucr.edu
Hi,

I first want to congratulate you on these excellent libraries! I will go directly to the problem: I use Net-Analysis-0.41 with perl v5.8.8 on Linux version 2.6.18-194.11.4.el5. My goal is to extract hyperlinks from a pcap file. I use the following code:

----------------------------------------
use strict;
use warnings;
require Exporter;
use Data::Dumper;
use Net::Analysis::Dispatcher;
use Net::Analysis::EventLoop;
use Net::Analysis::Listener::TCP;
use Net::Analysis::Listener::HTTP;

# One dispatcher shared by the event loop and all listeners
my ($d) = Net::Analysis::Dispatcher->new();
my ($el) = Net::Analysis::EventLoop->new (dispatcher => $d);

my $mon_obj_tcp = Net::Analysis::Listener::TCP->new(dispatcher => $d);
my $mon_obj_http = Net::Analysis::Listener::HTTP->new(dispatcher => $d);
# HTTPCollector is my own listener (described below)
my $mon_obj_base = HTTPCollector->new(dispatcher => $d);

my $target = shift;
die "could not read file '$target'\n" if (! -r $target);
$el->loop_file (filename => $target);
----------------------------------------

The HTTPCollector object inherits from Net::Analysis::Listener::Base and overrides the http_transaction event handler in order to parse the HTTP requests.

Inside the http_transaction function I noticed that $resp->decoded_content() sometimes does not decode the content; it returns undef. I tried to find out why, and I noticed that sometimes the payload itself is not a valid gzip-compressed object. This only happens when the HTTP response has the "Transfer-Encoding" header set to "chunked":

if ( defined $resp->header("Transfer-Encoding") ) {
    if ( $resp->header("Transfer-Encoding") =~ /chunked/ ) {
        # chunked response: this is the case where decoding fails
    }
}

In this case the payload (content) has this particular format:

<Offset in hex 1> (this is an 8-digit hex number denoting the size of the first gzipped chunk)
Gzipped chunk
<Offset in hex 2>
Gzipped chunk
…

It is not hard to go through the content and remove the <Offset in hex *> markers and any newline characters that might exist there.

The problem happens frequently. For example, Bing and Facebook use the Transfer-Encoding option when they use a keep-alive TCP connection for their HTTP requests (which they do quite frequently).

I currently have code that strips these offset markers from the payload and solves the problem. I can send you the code if you like! :)

Thank you so much for putting this together!!!!

Best regards,
Marios
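The stripping step described in the message above can be sketched as follows. This is a minimal illustration in Python rather than Perl, and it is not the reporter's code or the module's _unchunk_response. It assumes the captured body follows standard HTTP/1.1 chunked framing (a hex size line, CRLF, the chunk data, CRLF, ending with a zero-size chunk) and that the reassembled payload is gzip-compressed. The names `dechunk` and `decode_chunked_gzip` are hypothetical.

```python
import gzip


def dechunk(body: bytes) -> bytes:
    """Remove HTTP/1.1 chunked framing and return the raw payload bytes."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("missing chunk-size line")
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size_field = body[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break  # last chunk; trailer headers, if any, are ignored here
        chunk = body[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated chunk")
        out += chunk
        pos += size
        # Each chunk's data is followed by its own CRLF.
        if body[pos:pos + 2] != b"\r\n":
            raise ValueError("chunk not terminated by CRLF")
        pos += 2
    return bytes(out)


def decode_chunked_gzip(body: bytes) -> bytes:
    """De-chunk first, then gunzip the reassembled payload."""
    return gzip.decompress(dechunk(body))
```

Reading exactly `size` bytes per chunk, instead of searching for newline characters, matters because gzip data is binary and may itself contain CR or LF bytes.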
On Fri Oct 22 07:29:09 2010, marios@cs.ucr.edu wrote:
> Hi,
> I first want to congratulate you on these
> excellent libraries!

Hey, thanks! And thanks for the bug report :)

> I tried to find out why and I noticed that
> sometimes the payload itself is not a valid gzip
> compressed object. This only happens when HTTP
> uses the "Transfer-Encoding" option set to
> "chunked".

Hmmm. There is code inside N::A::L::HTTP.pm for handling chunked encodings mostly correctly (see _unchunk_response).

Could you possibly mail me a small capture file that isn't being parsed correctly, so I can work out what's going wrong?

Thanks,

- Adam

PS: Apologies for taking so long to pick this up :/
Subject: Re: [rt.cpan.org #62373] Problem decoding http contenet
Date: Fri, 31 Dec 2010 15:21:29 -0800 (PST)
To: bug-Net-Analysis [...] rt.cpan.org
From: marios [...] cs.ucr.edu
Hi Adam,

Thanks for replying :)! Hmm ... I didn't notice _unchunk_response when I was working on this :(. Perhaps that would do the trick (even though I have already written my own code to handle this ... X_x).

I don't have any pcaps that capture the problem ... it has been a while since I last worked on those areas of the code. If I run into it in the next few days I will send you one for sure!

Best wishes for a Happy 2011!

Marios