Re: [R] How do I make this faster?

From: Paul Hiemstra <paul.hiemstra_at_knmi.nl>
Date: Mon, 11 Apr 2011 11:13:12 +0200

On 04/11/2011 10:28 AM, Andreas Borg wrote:
> Hi Hasan,
>
> I'd be happy to help you, but I am not able to run your code. You use
> commandArgs to retrieve arguments of the R program, but which ones do
> you actually provide?
>
> Best regards,
>
> Andreas
>
> Hasan Diwan schrieb:
>> I was on vacation the last week and wrote some code to run a 500-day
>> correlation between the Nasdaq tracking stock (QQQ) and 191 currency
>> pairs. The initial run took 9 hours(!) and I'd like to make it faster.
>> So, I'm including my code below, in hopes that somebody will be able to
>> figure out how to make it faster, either through parallelisation or by
>> making changes. I've marked the places where Rprof showed me it was
>> slowing down:
>> currencyCorrelation <- function(lagtime = 1) {
>>   require(quantmod)
>>
>>   dataTrack <- getSymbols(commandArgs(trailingOnly=T)[1],
>>                           from='2009-11-21', to='2011-04-03')
>>   stockData <- get(dataTrack)
>>   currencies <- row.names(oanda.currencies[grep(pattern='oz.', fixed=T,
>>     x=as.vector(oanda.currencies$oanda.df.1.length.oanda.df...2....1.)) == F])
>>   correlations <- vector()
>>   values <- list()
>>   # optimise these loops using the apply family
>>   for (i in currencies) {
>>     for (j in currencies) {
>>       if (i == j) next()
>>       fx <- getFX(paste(i, j, sep='/'), from='2009-11-20', to='2011-04-02')
>>       # Prepare data by getting rates for market days only
>>       fx <- get(fx)
>>       fx <- fx[which(index(fx) %in% index(QQQ$QQQ.Close))]
>>       correlation <- cor(fx, QQQ$QQQ.Close)
>>       correlations <- c(correlations, correlation)
In this piece of code you concatenate correlation onto correlations. Because the vector grows on every iteration, R has to allocate a new, longer vector and copy the old contents into it each time. Preallocating the space you need (a rough overestimate is also fine) will make this much faster. Instead of creating zero-length objects for 'correlations' and 'values' before the start of the loop, create them at the desired length and assign into them by index inside the loop rather than concatenating. This could possibly speed up this part of your code by several orders of magnitude.
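
A minimal sketch of what I mean, assuming a small made-up list of currency codes and a hypothetical fake_correlation() helper standing in for the getFX() and cor() calls (neither the helper nor the four codes come from your script, and nothing is downloaded here). It only illustrates the indexing pattern; it does not measure how much of the 9 hours is spent growing the vectors.

```r
# Illustrative subset; the original script derives this from oanda.currencies.
currencies <- c("USD", "EUR", "JPY", "GBP")

# Hypothetical placeholder for the getFX() + cor() step in the quoted loop.
# It returns a deterministic number in [-1, 1] so the sketch runs offline.
fake_correlation <- function(i, j) {
  t <- seq_len(50)
  cor(sin(t * sum(utf8ToInt(i)) / 100), cos(t * sum(utf8ToInt(j)) / 100))
}

# Every ordered pair except i == j, so the final length is known up front.
n_pairs <- length(currencies) * (length(currencies) - 1)

# Preallocate at full length instead of starting from vector() / list().
correlations <- numeric(n_pairs)
labels <- character(n_pairs)

k <- 0
for (i in currencies) {
  for (j in currencies) {
    if (i == j) next
    k <- k + 1
    # Assign by index; no c() and therefore no repeated copying.
    correlations[k] <- fake_correlation(i, j)
    labels[k] <- paste(i, j, sep = "/")
  }
}

# Drop NAs and sort by correlation, highest first.
keep <- !is.na(correlations)
ord <- order(correlations[keep], decreasing = TRUE)
result <- data.frame(pair = labels[keep][ord],
                     correlation = correlations[keep][ord])
```

Keeping the pair labels and the correlations in two vectors of the same length also makes the NA filtering and the ordering at the end a matter of plain indexing.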

cheers,
Paul
>>       string <- paste(paste(i,j,sep='/'), correlation, sep=',')
>>       values <- c(values,paste(string,'\n', sep=''))
>>     }
>>   }
>>   # TODO eliminate NA's
>>   values <- values[which(correlations[is.na(correlations) == F])]
>>   correlations <- correlations[is.na(correlations) == F]
>>   values <- values[order(correlations, decreasing=T)]
>>   write.table(values, file=commandArgs(trailingOnly=T)[2], sep='',
>>               qmethod=NULL, quote = F, row.names=F, col.names=F)
>>   rm('currencies', 'correlations', 'values', 'fx', 'string')
>>   return()
>> }
>> lagtime <- as.integer(commandArgs(trailingOnly=T)[3])
>> if (is.na(lagtime)) lagtime <- 1
>> print(paste(Sys.time(), '<--- starting', lagtime,
>>             'day lag currencies correlation with',
>>             commandArgs(trailingOnly=T)[1], 'from 2009-11-20 to 2011-04-03'))
>> currencyCorrelation(lagtime)
>> print(paste(Sys.time(), '<--- ended, results in',
>>             commandArgs(trailingOnly=T)[2]))
>>
>>
>>
>
>

-- 
Paul Hiemstra, MSc
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 11 Apr 2011 - 09:15:31 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 11 Apr 2011 - 10:00:28 GMT.
