Re: [R] R Newbie, please help!

From: Joshua Wiley <jwiley.psych_at_gmail.com>
Date: Thu, 03 Jun 2010 21:18:22 -0700

Hello Jeff,

Try this:

test <- data.frame(totret=rnorm(10^7)) # create some sample data
test[-1,"dailyreturn"] <- test[-1,"totret"]/test[-nrow(test),"totret"]

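The general idea is to take the column "totret" excluding the first row, divided by "totret" excluding the last row. Both vectors are one element shorter than the column and offset by one position, so each element of the result is in effect totret at t+1 divided by totret at t.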
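To make the offset concrete, here is a small sketch using the four totret values from the sample rows in the question. The variable names are only for illustration, and padding with NA is one way to keep the result aligned with the original rows.

```r
# The four totret values shown in the question's sample rows
totret <- c(100.00000, 99.72300, 99.95380, 99.67680)

# Numerator drops the first element, denominator drops the last,
# so element i of the ratio is totret[i + 1] / totret[i]
ratio <- totret[-1] / totret[-length(totret)]

# Pad with NA in front so the result has one entry per original row
dailyreturn <- c(NA, ratio)
print(dailyreturn)
```

The first entry is NA because there is no earlier value to divide by.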

I assigned the result to a new column "dailyreturn"; its first row is left as NA, since there is no earlier value to divide by. For 10^7 rows, it took 1.92 seconds on my system.
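Note that the line above treats the whole column as one series, so it would also divide across the rows where the ID changes. One possible way to restrict the calculation to each ID, as a rough sketch rather than a tested solution for the real data, is base R's ave(). The toy data frame "prices" and the helper ratio_to_previous() are made up for illustration, and the rows are assumed to be already sorted by id and date.

```r
# Hypothetical toy data: two ids, already sorted by id and date
prices <- data.frame(
  id     = c("A", "A", "A", "B", "B"),
  totret = c(100, 110, 121, 50, 55)
)

# Ratio of each value to the previous one, with NA for the first value
ratio_to_previous <- function(x) c(NA, x[-1] / x[-length(x)])

# ave() applies the function separately within each id and returns
# the results in the original row order
prices$dailyreturn <- ave(prices$totret, prices$id, FUN = ratio_to_previous)
print(prices)
```

The first row of each id comes out as NA, so no return is computed across an ID boundary.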

HTH, Josh

On Thu, Jun 3, 2010 at 8:04 PM, Jeff08 <jefferyding_at_gmail.com> wrote:
>
> Hello Everyone,
>
> I just started a new job & it requires heavy use of R to analyze datasets.
>
> I have a data.table that looks like this. It is sorted by ID & Date, there
> are about 150 different IDs & the dataset spans 3 million rows. The main
> columns of concern are ID, date, and totret. What I need to do is to derive
> daily returns for each ID from totret, which is simply totret at time t+1
> divided by totret at time t.
>
>              X       id ticker      date_ adjClose    totret RankStk
> 427225   427225 00174410    AHS 2001-11-13    21.66 100.00000    1235
> 441910   441910 00174410    AHS 2001-11-14    21.60  99.72300    1235
> 458458   458458 00174410    AHS 2001-11-15    21.65  99.95380    1235
> 284003   284003 00174410    AHS 2001-11-16    21.59  99.67680    1235
>
> Two problems for me:
>
> 1) I can't just apply it to the entire column since there will be problems at
> the boundary points where the ID changes from one to another. I need to find
> out how to specify a restriction on the name of the ID
>
> 2) From Java, instinctively I would use a loop to calculate daily returns,
> but I found out that R is very slow with loops, so I need to find an
> efficient way to calculate daily returns with such a huge dataset.
>
> Thanks a lot!
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/R-Newbie-please-help-tp2242633p2242633.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/

Received on Fri 04 Jun 2010 - 04:21:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 04 Jun 2010 - 06:10:28 GMT.
