This queue is for tickets about the AnyEvent-Twitter-Stream CPAN distribution.

Report information
The Basics
Id: 64471
Status: resolved
Priority: 0/
Queue: AnyEvent-Twitter-Stream

People
Owner: Nobody in particular
Requestors: bryanpaluch [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: JSON parsing errors
Date: Tue, 4 Jan 2011 15:45:38 -0500
To: bug-AnyEvent-Twitter-Stream [...] rt.cpan.org
From: Bryan Paluch <bryanpaluch [...] gmail.com>
Hello,

When I try to run the example tracker.pl code with my own consumer key, I get errors parsing the JSON objects. There are normally 3 to 4 characters before the initial { sign, and I think this is throwing off the JSON parser, e.g.:

    malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "DD8") at /home/bpaluch/perl/lib/AnyEvent/Twitter/Stream.pm line 158.
    JSON text must be an object or array (but found number, string, true, false or null, use allow_nonref to allow this) at /home/bpaluch/perl/lib/AnyEvent/Twitter/Stream.pm line 158.
    garbage after JSON object, at character offset 1 (before "A5") at /home/bpaluch/perl/lib/AnyEvent/Twitter/Stream.pm line 158.

Using .20 with v5.10.1 (*) built for i486-linux-gnu-thread-multi on Linux 2.6.32.24+drm33.11-custom-uvc #1 SMP F (Ubuntu 10.04).

Thanks,
Bryan Paluch
From: znmeb [...] borasky-research.net
On Tue Jan 04 15:45:49 2011, bryanpaluch@gmail.com wrote: Yeah, I'm seeing the same thing. The code worked last year, so I'm guessing Twitter changed the format of what they're sending down the pipe slightly. I can run this with Komodo and capture raw data if necessary.
From: znmeb [...] borasky-research.net
On Wed Feb 16 19:32:23 2011, znmeb@borasky-research.net wrote:
> On Tue Jan 04 15:45:49 2011, bryanpaluch@gmail.com wrote:
>
> Yeah, I'm seeing the same thing. The code worked last year, so I'm guessing Twitter changed the format of what they're sending down the pipe slightly. I can run this with Komodo and capture raw data if necessary.
Actually, I have a successful capture with time stamp 2011-01-01T08:56:00Z. I shut that collector down, so I don't know when things changed, but this code is dated October 2010. I'm not sure which version of AnyEvent::Twitter::Stream I was running at the time, though - maybe I should try the previous version.
From: znmeb [...] borasky-research.net
Hmmm ... I think the "length" field may be what's either changed or newly working. Someone just posted this on the developers' mailing list: https://groups.google.com/d/topic/twitter-development-talk/US_ATDzMw8c/discussion
Just a wild guess: AnyEvent::HTTP now uses HTTP/1.1 requests, and AnyEvent::Twitter::Stream uses want_body_handle instead of on_body, so it now has to parse the resulting chunked-encoded stream itself, which it doesn't.

So likely neither AnyEvent::Twitter::Stream nor Twitter changed anything, but AnyEvent::HTTP does HTTP/1.1 instead of HTTP/1.0, and AnyEvent::Twitter::Stream doesn't cope with that (as it should have used on_body).
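If that guess is right, the stray "DD8" and "A5" in the original report would be consistent with hexadecimal chunk-size lines (3544 and 165 bytes) from HTTP/1.1 chunked transfer encoding. Here is a minimal sketch of what that framing looks like and how it can be undone. The helper decode_chunked and the sample payload are made up for illustration, and the sketch assumes the whole body is already in memory, which is not the case for a live stream:

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not part of AnyEvent::HTTP
# or AnyEvent::Twitter::Stream. Decodes a complete chunked body.
sub decode_chunked {
    my ($raw) = @_;
    my $body = '';
    # Each chunk is: hex length, optional extensions, CRLF, data, CRLF.
    while ($raw =~ /\G([0-9A-Fa-f]+)[^\r\n]*\r\n/gc) {
        my $len = hex $1;
        last if $len == 0;                  # a zero-length chunk ends the body
        $body .= substr($raw, pos($raw), $len);
        pos($raw) = pos($raw) + $len + 2;   # skip the data and its trailing CRLF
    }
    return $body;
}

# Made-up payload, framed the way a chunked response would frame it.
my $json = '{"text":"hello"}' . "\r\n";
my $raw  = sprintf("%X\r\n%s\r\n0\r\n\r\n", length $json, $json);

# $raw begins with a hex size line, not with "{", which is what a JSON
# parser reading the raw handle would trip over.
my $decoded = decode_chunked($raw);
```

Reading the raw handle means doing this framing work in the client, which is the part that appears to be missing.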
From: znmeb [...] borasky-research.net
On Thu Feb 17 10:01:27 2011, MLEHMANN wrote:
> Just a wild guess: AnyEvent::HTTP now uses HTTP/1.1 requests, and AnyEvent::Twitter::Stream uses want_body_handle instead of on_body, so it now has to parse the resulting chunked-encoded stream itself, which it doesn't.
>
> So likely neither AnyEvent::Twitter::Stream nor Twitter changed anything, but AnyEvent::HTTP does HTTP/1.1 instead of HTTP/1.0, and AnyEvent::Twitter::Stream doesn't cope with that (as it should have used on_body).
Could be - I have a little more information gleaned from running with Komodo, but I don't have the window up at the moment.

1. The JSON response is coming back from Twitter into $handle->{rbuf}.
2. Sometimes the JSON response has extra spaces at the end, which is croaking the JSON parser.
3. Sometimes the JSON response is "\r\n", which is croaking the parser.

I can hack the code to work in Komodo, which I'm planning to do either later tonight or some time this weekend, but we really should engage the author of the module and Twitter to validate what's going on - I don't want to fork this module based on my Komodo hacks and what's currently coming down Twitter's pipes. I'd like to know what changed and when, since some version of this was working on January 1, 2011. ;-)
Well, spaces at the end will not cause a JSON parser to croak - trailing whitespace is valid JSON.

Instead of hacking it to parse chunked encoding (which will break as soon as there are any HTTP changes again), it should use on_body and an incremental parser. Only that way is it future-proof against low-level changes in AnyEvent::HTTP (want_body_handle == raw stream, on_body == proper HTTP response).
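A minimal sketch of the incremental-parser half of that suggestion, with no network involved. It uses the core JSON::PP module as a stand-in (my assumption; the module itself uses JSON), and the fragments are invented to mimic what an on_body callback might receive: objects split at arbitrary points, trailing spaces, and a bare "\r\n" keep-alive:

```perl
use strict;
use warnings;
use JSON::PP;   # stand-in for the JSON module; both provide incr_parse

my $parser = JSON::PP->new;
my @seen;

# Invented fragments: the first object is cut in the middle of a string,
# the second is followed by spaces, and one fragment is only "\r\n".
my @fragments = (
    '{"text":"fir',
    'st"}' . "\r\n" . '{"delete":{"status":{"id":1}}}   ',
    "\r\n",
    '{"text":"second"}' . "\r\n",
);

for my $fragment (@fragments) {
    # In list context incr_parse returns every object it could complete
    # and keeps any unfinished remainder buffered for the next call.
    push @seen, $parser->incr_parse($fragment);
}
```

The point of the design is that the parser, not the caller, keeps track of where one object ends and the next begins, so the code no longer depends on how the bytes happen to be split.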
Unfortunately the AnyEvent::HTTP POD still claims (on_body) '.. is usually preferred over doing your own body handling via "want_body_handle", but in case of streaming APIs, where HTTP is only used to create a connection, "want_body_handle" is the better alternative, as it allows you to install your own event handler, reducing resource usage.' and (want_body_handle) '.. is useful with some push-type services, where, after the initial headers, an interactive protocol is used (typical example would be the push-style twitter API which starts a JSON/XML stream)'.

Also, in the code of AnyEvent::Twitter::Stream is this: "want_body_handle => 1, # for some reason on_body => sub {} doesn't work :/", though patching the module to use on_body does seem to work for me, something like this:

            want_body_handle => 0,
            on_body => sub {
                my ($content, $headers) = @_;
                $set_timeout->();
                # An empty body fragment is treated as a keep-alive.
                do {
                    $on_keepalive->();
                    return;
                } unless $content;
                # Reuse one parser so partial objects carry over between calls.
                $self->{_json} ||= JSON->new;
                # Dispatch each complete JSON object to the matching callback.
                for my $tweet ($self->{_json}->incr_parse($content)) {
                    if ($on_delete && $tweet->{delete} && $tweet->{delete}->{status}) {
                        $on_delete->($tweet->{delete}->{status}->{id}, $tweet->{delete}->{status}->{user_id});
                    } elsif ($on_friends && $tweet->{friends}) {
                        $on_friends->($tweet->{friends});
                    } elsif ($on_event && $tweet->{event}) {
                        $on_event->($tweet);
                    } else {
                        $on_tweet->($tweet);
                    }
                }
                1;  # a true return value keeps the download going
            },

Cleanup and error handling are omitted.

From: znmeb [...] borasky-research.net
On Fri Feb 18 03:10:52 2011, JWRIGHT wrote:
> [...] patching the module to use on_body does seem to work for me [...]
Well, the date of the AnyEvent::HTTP change is consistent with the date this bug report was opened. ;-) I sent an email to the author of AnyEvent::Twitter::Stream.
From: znmeb [...] borasky-research.net
As far as I know this has been fixed. How do we mark it as closed?