
This queue is for tickets about the Statistics-R CPAN distribution.

Report information
The Basics
Id: 97359
Status: resolved
Priority: 0/
Queue: Statistics-R

People
Owner: Nobody in particular
Requestors: ken [...] knowledgesynthesis.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: set() problems with large arrays
Date: Sat, 19 Jul 2014 23:17:01 -0700
To: bug-Statistics-R [...] rt.cpan.org
From: Ken Yamaguchi <ken [...] knowledgesynthesis.com>
Using Statistics-R-0.32's set() for a 50,000-element list is very slow and/or triggers "Resource temporarily unavailable: write(...) at /usr/share/perl5/IPC/Run/IO.pm line 558" on Debian 7.6 (wheezy) (IPC-Run-0.92).

The problem appears to be R's handling of newlines in the c() function. Feeding a c() with 50,000 doubles, each on a new line, to Rscript took 30 minutes to process. The same doubles on one line take 5 seconds.

Please see the attached script to reproduce, as well as the trivial patch (which doesn't appear to cause trouble with any R max line limits, at least on Debian wheezy).

perl -v:
This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-gnu-thread-multi

R --version:
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Platform: x86_64-pc-linux-gnu (64-bit)

uname -a:
Linux laptop 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u1 x86_64 GNU/Linux
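For readers without the attachments, the difference between the two input shapes can be sketched as follows. This is only an illustration in Python, not the module's Perl code and not the attached patch; the helper name r_vector_literal and the sample values are hypothetical.

```python
def r_vector_literal(values, one_line=True):
    """Build an R c(...) expression from a list of numbers.

    Hypothetical helper for illustration only; it is not part of
    Statistics::R. With one_line=False every element goes on its own
    line, which is the slow input shape described in the report.
    """
    sep = ", " if one_line else ",\n"
    return "c(" + sep.join(repr(float(v)) for v in values) + ")"


# 50,000 doubles, matching the size mentioned in the report
values = [i / 10 for i in range(50000)]

single = r_vector_literal(values, one_line=True)   # everything on one line
multi = r_vector_literal(values, one_line=False)   # one element per line

# Count the newlines R would have to read in each form
print(single.count("\n"), multi.count("\n"))
```

Both strings describe the same vector; only the number of newlines differs, which is the variable the report identifies as the likely cause of the slowdown.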

Message body is not shown because sender requested not to inline it.

Message body is not shown because sender requested not to inline it.

Hi Ken,

Thanks for the bug report and patch. I have applied the patch in the development repository. I can confirm that Statistics::R manages to read large arrays with your patch, and does so faster. The line limit is not triggered; this may be an issue specific to older versions of R (I run 3.1.1)?

Cheers,
Florent
Subject: Re: [rt.cpan.org #97359] set() problems with large arrays
Date: Tue, 19 Aug 2014 20:58:27 -0700
To: Florent Angly via RT <bug-Statistics-R [...] rt.cpan.org>
From: Ken Yamaguchi <ken [...] knowledgesynthesis.com>
Hi Florent,

While the line limit indeed seems to be an issue with only older versions of R (at least pre-2.15.0), I seem to have hit another bizarre R line-reading problem with the patch applied. See:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15941

At a very particular line length, R silently exits when using the invocation that Statistics::R uses. The attached patch resolves at least my particular test case. I suspect a combination of a magic line length (involving some near multiple of 4096) and multiple statements, although we'll need help from the R folks to know for sure.

Apologies if this information should have gone in a new report.

Thanks,
Ken
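As a rough diagnostic for the suspicion above, one could flag generated command lines whose byte length falls close to a multiple of 4096. This is a sketch under assumptions: the exact triggering length is not stated in this thread, so the slack value, the inclusion of the trailing newline in the count, and the function name near_block_multiple are all hypothetical, and the attached patch may work differently.

```python
def near_block_multiple(line, block=4096, slack=2):
    """Return True if the line's byte length is within `slack` bytes of a
    positive multiple of `block`.

    Assumptions (not confirmed by the thread): the length is counted in
    UTF-8 bytes plus one trailing newline, and `slack` is an arbitrary
    tolerance chosen for illustration.
    """
    n = len(line.encode("utf-8")) + 1  # +1 for the newline sent after the line
    remainder = n % block
    distance = min(remainder, block - remainder)
    # Require n to be close to at least one full block so short lines are not flagged
    return n >= block - slack and distance <= slack


# A line of 4095 characters plus its newline is exactly 4096 bytes
print(near_block_multiple("x" * 4095))
# A short line is nowhere near a block boundary
print(near_block_multiple("x" * 10))
```

A check like this only narrows down which inputs are worth testing against R; it does not explain or fix the underlying behaviour.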

Message body is not shown because sender requested not to inline it.