
This queue is for tickets about the XML-RSS-LibXML CPAN distribution.

Report information
The Basics
Id: 30145
Status: open
Priority: 0/
Queue: XML-RSS-LibXML

People
Owner: Nobody in particular
Requestors: ANDK [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.3002
Fixed in: (no value)



Subject: test flaky because my.netscape.com unreliable
I've tested 0.3002 10 times now and two times I got a fail because my.netscape.com was not answering or not answering fast enough. The error then looks like:

    t/items-are-0..................http://my.netscape.com/publish/formats/rss-0.91.dtd:1: parser error : Content error in the external subset
    <br />
    ^
    Can't call method "getNamespaces" on an undefined value at /home/sand/.cpan/build/XML-RSS-LibXML-0.3002-6AWmcl/blib/lib/XML/RSS/LibXML.pm line 172.
    # Looks like you planned 502 tests but only ran 472.
    # Looks like your test died just after 472.
    dubious
            Test returned status 255 (wstat 65280, 0xff00)
    DIED. FAILED tests 473-502
            Failed 30/502 tests, 94.02% okay (less 23 skipped tests: 449 okay, 89.44%)

Maybe you should distribute your own copy of the DTD to avoid this annoyance.

Thanks,
From: DMAKI [...] cpan.org
I can't reproduce this. I don't understand why your invocation of the tests would automatically trigger a download of an external DTD. Does XML::LibXML or libxml itself have such an option?
CC: ANDK [...] cpan.org
Subject: Re: [rt.cpan.org #30145] test flaky because my.netscape.com unreliable
Date: Sun, 21 Oct 2007 15:38:13 +0200
To: bug-XML-RSS-LibXML [...] rt.cpan.org
From: andreas.koenig.7os6VVqR [...] franz.ak.mind.de (Andreas J. Koenig)
>>>>> On Sun, 21 Oct 2007 07:22:23 -0400, " via RT" <bug-XML-RSS-LibXML@rt.cpan.org> said:
> I can't reproduce this.
> I don't understand why your invocation of the tests would automatically
> trigger a download of an external DTD.
> Does XML::LibXML or libxml itself have such an option?
Sure, the manpage for XML::LibXML::Parser says

    load_ext_dtd
        $parser->load_ext_dtd(1);

        Load external DTD subsets while parsing.

        This flag is also required for DTD Validation, to provide complete attribute, and to expand entities, regardless if the document has an internal subset. Thus switching off external DTD loading, will disable entity expansion, validation, and complete attributes on internal subsets as well.

        If you leave this parser flag untouched, everything will work, because the default is 1 (activated)

I have not actually read your code, I'm only smoking it;)

-- 
andreas
> I have not actually read your code, I'm only smoking it;)
Yeah, I know ;)

Thanks for the pointers. However, I have never felt the need to use load_ext_dtd() in any of my code before.

So I was just wondering if there was (peeeerhaps) something that forces libxml to load external DTDs without explicitly telling it to do so.
CC: ANDK [...] cpan.org
Subject: Re: [rt.cpan.org #30145] test flaky because my.netscape.com unreliable
Date: Sun, 21 Oct 2007 16:53:15 +0200
To: bug-XML-RSS-LibXML [...] rt.cpan.org
From: andreas.koenig.7os6VVqR [...] franz.ak.mind.de (Andreas J. Koenig)
>>>>> On Sun, 21 Oct 2007 09:50:32 -0400, " via RT" <bug-XML-RSS-LibXML@rt.cpan.org> said:
>> I have not actually read your code, I'm only smoking it;)
> Yeah, I know ;)
> Thanks for the pointers.
> However, I have never felt the need to use load_ext_dtd() in any of my
> code before.
> So I was just wondering if there was (peeeerhaps) something that forces
> libxml to load external DTDs without explicitly telling it to do so.
It's the other way round. If you want to prevent libxml from visiting external sites, you must set load_ext_dtd(0) or something like that. I have no document that sums it up; I have only heard that some people keep external DTDs in a local folder to speed things up when libxml once again decides to go out shopping. I do not know the details, unfortunately.

-- 
andreas
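
For illustration only, a minimal sketch of what switching the flag off could look like. It is not taken from XML::RSS::LibXML's code: the sample feed and the variable names are made up for this example, the DOCTYPE is assumed to match the one the failing test files use, and the no_network call is guarded because older XML::LibXML releases may not provide it.

```perl
use strict;
use warnings;
use XML::LibXML;

# Made-up RSS 0.91 document whose DOCTYPE points at the remote DTD
# named in the error message above.
my $rss = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
    "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91"><channel><title>Example</title></channel></rss>
XML

my $parser = XML::LibXML->new;

# Ask libxml not to load the external DTD subset at all.
# Per the manpage quoted above, this also disables validation,
# entity expansion and attribute completion.
$parser->load_ext_dtd(0);

# Assumption: no_network exists in the installed XML::LibXML;
# the guard keeps the sketch working where it does not.
$parser->no_network(1) if $parser->can('no_network');

my $doc = $parser->parse_string($rss);
print "root element: ", $doc->documentElement->nodeName, "\n";
```

Whether losing entity expansion is acceptable for the module's test feeds would need checking before this goes into the distribution.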
From: DMAKI [...] cpan.org
Thanks again for the pointers.

Hmm, I tried the same tests with the network off, but they still pass... Perhaps validation is simply skipped if the request to fetch the external DTD fails.

As I can't reproduce this in my environment, I'm going to have to research how to distinguish between the success and failure cases, so I can confirm that shipping the distro with local DTDs will work.

It may take a bit till I get some tuits, but I'll put this in my TODO. (Hopefully I'll have some time at the end of the month.)
Now I have smoked 0.3002 for a while and collected 95 PASSes and 15 FAILs. And today I had a chance to read the XML::LibXML::Parser manpage.

Here is the error message again that I get from the most recent test run:

    t/items-are-0....................http://my.netscape.com/publish/formats/rss-0.91.dtd:1: parser error : Content error in the external subset
    <br />
    ^
    Can't call method "getNamespaces" on an undefined value at /home/sand/.cpan/build/XML-RSS-LibXML-0.3002-NE3qqm/blib/lib/XML/RSS/LibXML.pm line 172.
    # Looks like you planned 502 tests but only ran 472.
    # Looks like your test died just after 472.
    Dubious, test returned 255 (wstat 65280, 0xff00)
    Failed 30/502 subtests
            (less 23 skipped subtests: 449 okay)

I can imagine that you cannot reproduce the error because it seems to be rare that the netscape site is broken. But 98:15 is quite a bad result set.

As for what happens when you turn the network off, I don't know. But if you have the network up, try this as root:

    tcpdump -w /tmp/tcpdump.out -s65535 -q -n 'host my.netscape.com'

(which sits and waits). Then run 'make test' with your module, then INTR the tcpdump process and see what your computer talked to at netscape.

And here are the two relevant snippets from said manpage:

    $parser->load_ext_dtd(1);
    $parser->load_catalog( $catalog_file );

In principle there are two strategies to deal with this problem. Either one turns off "load_ext_dtd" or one prepares a catalog and distributes it with XML::RSS::LibXML. A third alternative is to add more evals, catch more errors, and improve error reporting.

Which one would you prefer and how would you approach it?

Thanks,
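
As a rough sketch of the third alternative (more evals and clearer error reporting), something along these lines might do. The helper parse_feed is hypothetical and does not exist in XML::RSS::LibXML, and turning off load_ext_dtd inside it is an assumption borrowed from the first strategy. The point is only that a failed parse gets reported where it happens, instead of surfacing later as a method call on an undefined value.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns (document, undef) on success and
# (undef, error message) on failure, so callers never end up calling
# methods on an undefined document.
sub parse_feed {
    my ($xml) = @_;

    my $parser = XML::LibXML->new;
    $parser->load_ext_dtd(0);    # assumption: the DTD is not needed here

    # parse_string dies on a fatal parser error; eval catches that.
    my $doc = eval { $parser->parse_string($xml) };
    return (undef, "could not parse feed: $@") unless defined $doc;

    return ($doc, undef);
}

# A well-formed feed parses normally.
my ($doc, $err) = parse_feed('<rss version="0.91"><channel/></rss>');
print defined $doc ? "parsed ok\n" : "reported: $err\n";

# A deliberately malformed feed is reported instead of dying later.
my ($bad_doc, $bad_err) = parse_feed('<rss><channel></rss>');
print defined $bad_doc ? "parsed ok\n" : "reported: $bad_err\n";
```

This does not remove the dependency on the remote site by itself; it only makes the failure easier to read in a smoke report.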