Bug #18158 for WWW-Babelfish: extract_text does not work for some queries

Subject:

extract_text does not work for some queries

For some odd reason, Babelfish usually includes the translated text as the value of an <input type=hidden name="q"> tag, which WWW::Babelfish currently relies on, but it reproducibly fails to do so for some texts, e.g.

    perl -MWWW::Babelfish -le '$b=new WWW::Babelfish;print$b->translate(source=>"German",destination=>"English",text=>"Neuhaus am Rennweg, Stadt")'

You'll find more examples that reproduce this problem in the attachment.

To make that work, I suggest changing the extract_text routine for Babelfish to the following:

    # Extract the text from the html we get back from babelfish
    # and return it
    extract_text => sub {
        my($html) = @_;
        my $p = HTML::TokeParser->new(\$html);
        while ( my $_tag = $p->get_tag('div') ) {
            # Only the <div> whose sole attribute is style="padding:10px;" holds the translation
            my($tag,$attr,$attrseq) = @$_tag;
            next unless @$attrseq == 1 && $attrseq->[-1] eq 'style' && $attr->{style} eq 'padding:10px;';
            my($token) = $p->get_token or return;
            my ( $type, $text, $is_data ) = @$token;
            next if $type ne 'T';
            return decode( utf8 => $text );
        }
    }

Regards,
fany
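For anyone who wants to try the suggested routine outside of WWW::Babelfish, here is a minimal standalone sketch. It assumes HTML::TokeParser (from the HTML-Parser distribution) and Encode are installed. The sample HTML is made up for illustration and is not a real Babelfish response; only the style="padding:10px;" check is taken from the report above.

```perl
use strict;
use warnings;
use HTML::TokeParser;
use Encode qw(decode);

# Standalone version of the suggested routine: return the text that
# directly follows the first <div> whose only attribute is
# style="padding:10px;", or nothing if no such <div> exists.
sub extract_text {
    my ($html) = @_;
    my $p = HTML::TokeParser->new( \$html );
    while ( my $div = $p->get_tag('div') ) {
        my ( undef, $attr, $attrseq ) = @$div;
        next
          unless @$attrseq == 1
          && $attrseq->[0] eq 'style'
          && $attr->{style} eq 'padding:10px;';
        my $token = $p->get_token or return;
        my ( $type, $text ) = @$token;
        next if $type ne 'T';    # skip if the div starts with markup, not text
        return decode( utf8 => $text );
    }
    return;
}

# Hypothetical page layout, NOT an actual Babelfish response.
my $sample = '<html><body><div id="nav">menu</div>'
  . '<div style="padding:10px;">translated text</div></body></html>';

print extract_text($sample), "\n";
```

The first <div> is skipped because its single attribute is id rather than style, so the routine does not depend on the hidden input field being present at all.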

Subject:

textlist

Download textlist
application/octet-stream 1.8k