Re: [Rd] R-2.15.2 changes in computation speed. Numerical precision?

From: Paul Johnson <pauljohn32_at_gmail.com>
Date: Fri, 14 Dec 2012 01:01:19 -0600

On Thu, Dec 13, 2012 at 9:01 PM, Yi (Alice) Wang <yi.wang_at_unsw.edu.au> wrote:
> I have also encountered a similar problem. My mvabund package runs much
> faster on linux/OSX than on windows with both R/2.15.1 and R/2.15.2. For
> example, with mvabund_3.6.3 and R/2.15.2,
> system.time(example(anova.manyglm))
>

Hi, Alice

You have a different problem than I do.

The change from R-2.15.1 to R-2.15.2 makes my program slower on all platforms. That slowdown, which appears in R-2.15.2 on all types of hardware, is what concerns me.

It only seemed like a "Windows is better" issue because all the Windows users who tested my program were still using R-2.15.0 or R-2.15.1. As soon as they update R, they see the slowdown as well.

pj

> on OSX returns
>
> user system elapsed
> 3.351 0.006 3.381
>
> but on windows 7 it returns
>
> user system elapsed
> 13.13 0.00 13.14
>
> I also used svd frequently in my c code though by calling the gsl functions
> only. In my memory, I think the comp time difference is not that significant
> with earlier R versions. So maybe it is worth an investigation?
>
> Many thanks,
> Yi Wang
>
>
> On Thu, Dec 13, 2012 at 5:33 PM, Uwe Ligges
> <ligges_at_statistik.tu-dortmund.de> wrote:
>>
>> Long message, but as far as I can see, this is not about base R but the
>> contributed package Amelia: Please discuss possible improvements with its
>> maintainer.
>>
>> Best,
>> Uwe Ligges
>>
>>
>> On 12.12.2012 19:14, Paul Johnson wrote:
>>>
>>> Speaking of optimization and speeding up R calculations...
>>>
>>> I mentioned last week I want to speed up calculation of generalized
>>> inverses. On Debian Wheezy with R-2.15.2, I see a huge speedup using a
>>> souped up generalized inverse algorithm published by
>>>
>>> V. N. Katsikis, D. Pappas, Fast computing of the Moore-Penrose inverse
>>> matrix, Electronic Journal of Linear Algebra,
>>> 17(2008), 637-650.
>>>
>>> I was so delighted to see the computation time drop on my Debian
>>> system that I boasted to the Windows users and gave them a test case.
>>> They answered back "there's no benefits, plus Windows is faster than
>>> Linux".
>>>
>>> That sent me off on a bit of a goose chase, but I think I'm beginning
>>> to understand the situation. I believe R-2.15.2 introduced a tighter
>>> requirement for precision, thus triggering longer-lasting calculations
>>> in many example scripts. Better algorithms can avoid some of that
>>> slowdown, as you see in this test case.
>>>
>>> Here is the test code you can run to see:
>>>
>>> http://pj.freefaculty.org/scraps/profile/prof-puzzle-1.R
>>>
>>> It downloads a data file from that same directory and then runs some
>>> multiple imputations with the Amelia package.
>>>
>>> Here's the output from my computer
>>>
>>> http://pj.freefaculty.org/scraps/profile/prof-puzzle-1.Rout
>>>
>>> That includes the profile of the calculations that depend on the
>>> ordinary generalized inverse algorithm based on svd and the new one.
>>>
>>> See? The KP algorithm is faster. And just as accurate as
>>> Amelia:::mpinv or MASS::ginv (for details on that, please review my
>>> notes in http://pj.freefaculty.org/scraps/profile/qrginv.R).
>>>
>>> So I asked Windows users for more detailed feedback, including
>>> sessionInfo(), and I noticed that my proposed algorithm is not faster
>>> on Windows--WITH OLD R!
>>>
>>> Here's the script output with R-2.15.0, which shows no speedup from
>>> the KPginv algorithm
>>>
>>> http://pj.freefaculty.org/scraps/profile/prof-puzzle-1-Windows.Rout
>>>
>>> On the same machine, I updated to R-2.15.2, and we see the same
>>> speedup from the KPginv algorithm
>>>
>>> http://pj.freefaculty.org/scraps/profile/prof-puzzle-1-CRMDA02-WinR2.15.2.Rout
>>>
>>> Once I realized it is an R version change, not an OS
>>> difference, I was a bit relieved.
>>>
>>> What causes the difference in this case? In the Amelia code, they try
>>> to avoid doing the generalized inverse by using the ordinary solve(),
>>> and if that fails, then they do the generalized inverse. In R 2.15.0,
>>> the near singularity of the matrix is ignored, but not in R 2.15.2.
>>> The ordinary solve is failing almost all the time, thus triggering the
>>> use of the svd based generalized inverse, which is slower.
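
The solve-then-fall-back pattern described in the paragraph above can be sketched in base R as follows. This is only an illustration under stated assumptions, not Amelia's actual code: the names svd_ginv and inv_or_ginv are hypothetical, and the tolerance is an arbitrary choice modeled on the usual sqrt(.Machine$double.eps) convention.

```r
# Hypothetical helper: Moore-Penrose inverse via svd (base R only).
# Singular values below tol * (largest singular value) are treated as zero.
svd_ginv <- function(X, tol = sqrt(.Machine$double.eps)) {
  s <- svd(X)
  keep <- s$d > tol * s$d[1]
  if (!any(keep)) {
    return(matrix(0, ncol(X), nrow(X)))
  }
  # V_k %*% diag(1/d_k) %*% t(U_k); the vector recycles down the rows of t(U_k)
  s$v[, keep, drop = FALSE] %*%
    ((1 / s$d[keep]) * t(s$u[, keep, drop = FALSE]))
}

# Hypothetical wrapper: try the ordinary inverse first, and use the
# svd based generalized inverse only if solve() signals an error.
inv_or_ginv <- function(A) {
  tryCatch(
    list(value = solve(A), method = "solve"),
    error = function(e) list(value = svd_ginv(A), method = "svd")
  )
}

A <- matrix(c(2, 1, 1, 3), 2, 2)   # well conditioned: solve() succeeds
B <- diag(c(1, 2, 0))              # column of all zeros: solve() fails
res_A <- inv_or_ginv(A)
res_B <- inv_or_ginv(B)
```

If solve() starts rejecting matrices that it used to accept, every such call pays for the failed attempt plus the svd, which is the kind of slowdown described here. Wrapping the calls in system.time() on a representative matrix is one way to see how much time goes to the fallback branch.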
>>>
>>> The Katsikis and Pappas 2008 algorithm is the fastest one I've found
>>> after translating from Matlab to R. It is not as universally
>>> applicable as svd based methods: it will fail if there are linearly
>>> dependent columns. However, it does tolerate columns of all zeros,
>>> which seems to be the problem case in the particular application I am
>>> testing.
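
To show why a cross-product-based generalized inverse can be cheaper than an svd based one, and why it breaks on linearly dependent columns, here is a minimal sketch of the textbook full-rank formulas. It is not the KPginv code from qrginv.R, and it makes no attempt to handle columns of all zeros; the name crossprod_ginv is hypothetical.

```r
# Hypothetical helper: generalized inverse from cross-products.
# Assumes X has full column rank (tall case) or full row rank (wide case).
crossprod_ginv <- function(X) {
  if (nrow(X) >= ncol(X)) {
    # Tall or square: X+ = (X'X)^{-1} X'
    solve(crossprod(X), t(X))
  } else {
    # Wide: X+ = X' (XX')^{-1}; XX' is symmetric, so transpose the solve
    t(solve(tcrossprod(X), X))
  }
}

set.seed(1)
X <- matrix(rnorm(20), 5, 4)   # random 5 x 4 matrix, full column rank
G <- crossprod_ginv(X)

# Two of the Penrose conditions, checked numerically
ok_1 <- isTRUE(all.equal(X %*% G %*% X, X))
ok_2 <- isTRUE(all.equal(G %*% X %*% G, G))
```

The only factorization here is of the small square cross-product matrix, not of X itself, which is where the saving over svd comes from. With linearly dependent columns that cross-product is singular, so solve() will typically stop with an error.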
>>>
>>> I tried very hard to get the newer algorithm described here to go as
>>> fast, but it is way way slower, at least in the implementations I
>>> tried:
>>> ## KPP
>>> ## Vasilios N. Katsikis, Dimitrios Pappas, Athanassios Petralias. "An
>>> ## improved method for the computation of the Moore Penrose inverse
>>> ## matrix," Applied Mathematics and Computation, 2011
>>>
>>> The notes on that are in the qrginv.R file linked above.
>>>
>>> The fact that I can't make that newer KPP algorithm go faster,
>>> although the authors show it can go faster in Matlab, leads me to a
>>> bunch of other questions and possibly the need to implement all of
>>> this in C with LAPACK or EIGEN or something like that, but at this
>>> point, I've got to return to my normal job. Perhaps somebody who is
>>> good at R's .Call interface can make a pure C implementation of KPP.
>>>
>>> I think the key thing is that with R-2.15.2, there is an svd-related
>>> bottleneck in the multiple imputation algorithms in Amelia. The
>>> replacement version of the function Amelia:::mpinv does reclaim a 30%
>>> time saving, while generating imputations that are identical, so far
>>> as I can tell.
>>>
>>> pj
>>>
>>>
>>>
>>
>> ______________________________________________
>> R-devel_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Dr. Wang, Yi (Alice)
> Research Assistant Professor
> Institute of Computational and Theoretical Studies
> Department of Computer Science
> Faculty of Science
> Hong Kong Baptist University
> Kowloon Tong, Hong Kong
> Email: yiwang_at_comp.hkbu.edu.hk
> Tel: +852-3411-2789
> Web: http://www.icts.hkbu.edu.hk/yiwang/public/
>

-- 
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu

Received on Fri 14 Dec 2012 - 07:09:23 GMT