Subject: set() problems with large arrays
Date: Sat, 19 Jul 2014 23:17:01 -0700
To: bug-Statistics-R [...] rt.cpan.org
From: Ken Yamaguchi <ken [...] knowledgesynthesis.com>
Using Statistics-R-0.32's set() on a 50,000-element list is very slow and/or triggers the error "Resource temporarily unavailable: write(...) at /usr/share/perl5/IPC/Run/IO.pm line 558" on Debian 7.6 (wheezy) with IPC-Run-0.92.
The problem appears to be R's handling of newlines inside a c() call. Feeding Rscript a c() call with 50,000 doubles, each on its own line, took 30 minutes to process; the same doubles on a single line took 5 seconds. Please see the attached script to reproduce the problem, as well as the trivial patch (which does not appear to run into any R maximum line length limit, at least on Debian wheezy).
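For readers without access to the attachments, here is a rough shell sketch of the two input forms being compared. It is not the attached script: the file names, the temporary directory, and the generated values (1.5, 2.5, ...) are illustrative assumptions. It only writes the two R files and prints their line counts.

```shell
#!/bin/sh
# Illustrative sketch only; names and values are assumptions, not the attached reproducer.
n=50000
dir=$(mktemp -d)

# Form 1: c() with one double per line (the slow case described in this report).
{
  printf 'x <- c(\n'
  seq 1 "$n" | awk -v n="$n" '{ printf "%s.5%s\n", $1, (NR < n ? "," : "") }'
  printf ')\ncat(length(x), "\\n")\n'
} > "$dir/multiline.R"

# Form 2: the same doubles, all on a single line.
{
  printf 'x <- c('
  seq 1 "$n" | awk '{ printf "%s%s.5", (NR > 1 ? ", " : ""), $1 }'
  printf ')\ncat(length(x), "\\n")\n'
} > "$dir/oneline.R"

# Show how many lines each generated file has.
wc -l "$dir/multiline.R" "$dir/oneline.R"
```

Running Rscript on each generated file under a timer should show whether the gap reported here appears on a given system. Results are likely to depend on the R version and on how the input reaches R (file versus pipe).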
perl -v:
This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-gnu-thread-multi
R --version:
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Platform: x86_64-pc-linux-gnu (64-bit)
uname -a:
Linux laptop 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u1 x86_64 GNU/Linux
Message body is not shown because sender requested not to inline it.