From: ronggui <0034058_at_fudan.edu.cn>

Date: Mon 25 Jul 2005 - 18:32:20 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Mon Jul 25 18:36:25 2005

Yes, I meant n=10000, but I missed a zero.

If n=10000, t=3, it takes about 3 seconds.

If n=2000, t=7, it takes about 10 seconds.
I want to write a function to fit a model, and the data may be quite large (perhaps n<=20000, t<=10). As n and t grow, the time will become much longer, which is of course reasonable. But I think there should be more elegant code to do this job, so I posted here.

What I really want to know is how to optimize the code for this purpose. Of course, I can still fit my model even if I use this code, and I still like R very much, as it is free, flexible and powerful.

>> If n is quite large, say n=1000 and t=3, it requires too much time, so I
>> would like to know whether there is a more efficient way to do it.
>
>Why is about 0.4 second (which is what it takes on my system) too long?
>
>Given that you want to operate on 3000 cells, a second does not look
>unreasonable.
>
>This is a toy problem, and it is unclear what the real problem is (if
>any). Since you have the same number of replications for each cell
>(group-variable combination)

I also want to handle the case where the number of replications differs between cells.
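For the unbalanced case, one possible approach is to build the group means from rowsum() and the group sizes from tabulate(), so that no equal-replication assumption is needed. The following is only a sketch: the name group_demean and the toy data are my own assumptions for illustration, not something from the replies in this thread.

```r
# Sketch only: group means and deviations when group sizes differ.
# `group_demean` is a hypothetical helper name, not from the thread.
group_demean <- function(x, id) {
  x <- as.matrix(x)
  f <- factor(id)                 # factor() drops any unused levels
  g <- as.integer(f)              # group codes 1..k, one per row of x
  sums   <- rowsum(x, g)          # k x p matrix of column sums, rows ordered 1..k
  counts <- tabulate(g, nbins = nlevels(f))  # group sizes in the same order
  xm <- sums / counts             # counts recycle down each column
  rownames(xm) <- levels(f)
  xdm <- x - xm[g, , drop = FALSE]  # subtract each row's own group mean
  list(xm = xm, xdm = xdm)
}

# Toy data with unequal replication (group sizes 1, 3, 2 and 5)
set.seed(1)
id  <- rep(1:4, times = c(1, 3, 2, 5))
x   <- cbind(y = rnorm(11), z = rnorm(11))
res <- group_demean(x, id)
```

Because rowsum() does the grouping in compiled code and the rest is plain vectorised arithmetic, I would expect this to scale better than a tapply() call per column, but I have not timed it on the sizes mentioned above.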

>I would use this as a n x 3 x t array (a
>simple call to dim and aperm). Then rowMeans will find the group means,
>and you can just subtract those to get the deviations from the means,
>making use of recycling.
>
>E.g.
>
>D <- d[,-1]
>dim(D) <- c(t,n,3)
>D <- aperm(D, c(2,3,1))
>gmeans <- rowMeans(D, dims=2)
>d[,-1] - rep(gmeans, each=3)
>
>That takes under 10ms for n=1000
>
>
>On Mon, 25 Jul 2005, ronggui wrote:
>
>>> n=10;t=3
>>> d<-cbind(id=rep(1:n,each=t),y=rnorm(n*t),x=rnorm(n*t),z=rnorm(n*t))
>>> head(d)
>>      id          y           x          z
>> [1,]  1 -2.1725379  0.07629954 -0.3985258
>> [2,]  1 -1.2383038 -2.49667038  0.6966127
>> [3,]  1 -1.2642401 -0.50613307  0.4895856
>> [4,]  2  0.2171246  0.86711864 -0.6660036
>> [5,]  2  2.2765760 -0.48547142 -1.4496664
>> [6,]  2  0.5985345 -1.06427035  2.1761071
>>
>> First, I want to get the group mean of each variable, for which I can use
>>> d<-data.frame(d)
>>> aggregate(d,list(d$id),mean)[,-1]
>>    id           y          x           z
>> 1   1 -1.55836060 -0.9755013  0.26255754
>> 2   2  1.03074502 -0.2275410  0.02014565
>> 3   3  0.20700121 -0.7159450  1.35890176
>> 4   4  0.17839650  1.2575891  0.04135165
>> 5   5 -0.20012508  0.4310221  0.55458899
>> 6   6 -0.13084185 -0.2953392  0.28229068
>> 7   7  0.20737288 -0.8863761 -0.50793880
>> 8   8  0.07512612 -0.6591304 -0.21656533
>> 9   9  0.94727796 -0.6108891  0.13529884
>> 10 10 -0.04434875  0.1332086 -0.88229808
>>
>> Then I want the group-mean deviation data, like
>>> head(sapply(d[,2:4],function(x) x-ave(x,d$id)))
>>               y          x          z
>> [1,] -0.6141773  1.0518008 -0.6610833
>> [2,]  0.3200568 -1.5211691  0.4340552
>> [3,]  0.2941205  0.4693682  0.2270281
>> [4,] -0.8136205  1.0946597 -0.6861493
>> [5,]  1.2458310 -0.2579304 -1.4698121
>> [6,] -0.4322105 -0.8367293  2.1559614
>>
>> Both of the above are what I want. Although I can do it with the function below, if n is quite large, say n=1000 and t=3, it requires too much time, so I would like to know whether there is a more efficient way to do it.
>>
>> myfun<-function(x,id)
>> {
>>   x<-as.matrix(x)
>>   id<-as.factor(id)
>>   # group means: one row per level of id, one column per variable
>>   xm<- apply(x,2,function(y,z) tapply(y,z, mean), z=id)
>>   # deviations: subtract each row's group mean
>>   xdm<- x[] <- x-xm[id,]
>>   re<-list(xm=xm, xdm=xdm)
>>   re
>> }
>>
>>
>
>--
>Brian D. Ripley, ripley@stats.ox.ac.uk
>Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>University of Oxford, Tel: +44 1865 272861 (self)
>1 South Parks Road, +44 1865 272866 (PA)
>Oxford OX1 3TG, UK Fax: +44 1865 272595
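
For reference, the array approach quoted above can be wrapped so that it works for any t rather than only t=3. This is only a sketch under two assumptions of mine: the data are balanced, and the rows are sorted so that each id occupies t consecutive rows. The name balanced_demean is hypothetical, and x must be a numeric matrix (for example d[,-1] taken before d is converted to a data frame).

```r
# Sketch only: balanced panel, rows sorted by id, t consecutive rows per group.
# `balanced_demean` is a hypothetical helper name, not from the thread.
balanced_demean <- function(x, n, t) {
  x <- as.matrix(x)
  stopifnot(nrow(x) == n * t)
  p <- ncol(x)
  D <- x
  dim(D) <- c(t, n, p)               # index order: replicate, group, variable
  D <- aperm(D, c(2, 3, 1))          # now n x p x t
  gmeans <- rowMeans(D, dims = 2)    # n x p matrix of group means
  colnames(gmeans) <- colnames(x)    # dim<- dropped the names, so restore them
  xdm <- x - rep(gmeans, each = t)   # each mean repeated t times matches x's layout
  list(xm = gmeans, xdm = xdm)
}

# Same layout as the example data in the original post, but with t = 4
set.seed(2)
n <- 5; t <- 4
d <- cbind(id = rep(1:n, each = t), y = rnorm(n*t), x = rnorm(n*t), z = rnorm(n*t))
res <- balanced_demean(d[, -1], n, t)
```

The only change from the quoted code is each = t in place of each = 3, plus the check on the number of rows, which fails early if the balance assumption does not hold.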

This archive was generated by hypermail 2.1.8: Fri 03 Mar 2006 - 03:34:01 EST