From: Grant Gillis <grant.j.gillis_at_gmail.com>

Date: Sat, 19 Apr 2008 13:37:36 -0700

R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sat 19 Apr 2008 - 20:40:07 GMT

I am sorry for the incorrect subject. It autofilled without my noticing in time. A better subject would be "Calculating proportion of shared occurrences and randomizations".

Grant

2008/4/19 Grant Gillis <grant.j.gillis_at_gmail.com>:

> Hello All,

>
> Once again thanks for all of the help to date. I am climbing my R
> learning curve. I've got a few more questions that I hope I can get some
> guidance on though. I am not sure whether the etiquette is to break up
> multiple questions or not but I'll keep them together here for now as it may
> help put the questions in context despite the fact that the post may get a
> little long.
>
> Question 1:
>
> My first goal is to calculate the proportion of shared 1) behaviours and
> 2) alleles between numerous individuals. Pasted below (the 'propshared'
> function) is what I have now, and it works very well for calculating the
> proportion of shared behaviours where the data are formatted with each column
> as a behaviour and each row as an individual. Microsatellite genotypes are
> formatted differently. An example is below. Each row is an individual and
> each column is one allele from a single locus. In the values below, L1
> and L1.1 each give one copy of an allele for the same locus. Occasionally
> different loci will have the same value, although these are not actually
> the same allele.
>
> I would like the calculation of the proportion of shared values for
> alleles to be restricted to shared alleles within loci for all
> individuals (pairs of columns L1 and L1.1, L2 and L2.1, ...). What I have
> now calculates the proportion of shared values for alleles across loci. A
> specific example is that I would like the value 2 for individual w at
> L1 to be considered the same as the value 2 for individual y at
> L1.1, but not the same as the value 2 for any other individual within
> any other pair of columns.
>
> genos <- data.frame(
>   L1 = c(2, NA, 1, 3),
>   L1 = c(1, NA, 2, 3),
>   L2 = c(5, 2, 5, 3),
>   L2 = c(3, 4, 2, 4),
>   L3 = c(4, 5, 7, 2),
>   L3 = c(4, 6, 6, 6) )
>
> rownames(genos) <- c("w", "x", "y", "z")
>
> > genos
>    L1 L1.1 L2 L2.1 L3 L3.1
> w   2    1  5    3  4    4
> x  NA   NA  2    4  5    6
> y   1    2  5    2  7    6
> z   3    3  3    4  2    6
>
> propshared <- function(genos) {
>   x <- sapply(rownames(genos), function(ind1)
>     sapply(rownames(genos), function(ind2)
>       sum(genos[ind1, ] == genos[ind2, ], na.rm = TRUE)) /
>       length(genos[1, ]))
>   is.na(diag(x)) <- TRUE
>   x
> }
>
> > propshared(genos)
>           w         x         y         z
> w        NA 0.0000000 0.1666667 0.1666667
> x 0.0000000        NA 0.1666667 0.3333333
> y 0.1666667 0.1666667        NA 0.3333333
> z 0.1666667 0.3333333 0.3333333        NA
>
> The matrix I would like to have would look like this:
>
>             w           x           y           z
> w          NA           0 0.333333333 0.166666667
> x           0          NA 0.166666667 0.166666667
> y 0.333333333 0.166666667          NA 0.166666667
> z 0.166666667 0.166666667 0.166666667          NA
>
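One way to read Question 1 is sketched below. It is only an illustration under stated assumptions, not a confirmed answer: the two columns of each locus are assumed to sit next to each other, and "shared" is assumed to mean the number of allele copies two individuals have in common at a locus regardless of column (a multiset intersection), divided by the total number of allele columns. The names `shared_at_locus` and `propshared_loci` are made up for this example.

```r
# Example data from the post; data.frame() renames the duplicated column
# names to L1, L1.1, L2, L2.1, L3, L3.1 (check.names = TRUE is the default).
genos <- data.frame(
  L1 = c(2, NA, 1, 3), L1 = c(1, NA, 2, 3),
  L2 = c(5, 2, 5, 3),  L2 = c(3, 4, 2, 4),
  L3 = c(4, 5, 7, 2),  L3 = c(4, 6, 6, 6)
)
rownames(genos) <- c("w", "x", "y", "z")

# Hypothetical helper: allele copies shared by two genotypes at ONE locus,
# ignoring which of the two columns a copy sits in. NAs never match.
shared_at_locus <- function(a, b) {
  a <- a[!is.na(a)]
  b <- b[!is.na(b)]
  vals <- intersect(a, b)
  sum(vapply(vals, function(v) min(sum(a == v), sum(b == v)), numeric(1)))
}

# Hypothetical replacement for propshared(): compare only within loci.
# Assumption: columns 1-2 are locus 1, columns 3-4 are locus 2, and so on.
propshared_loci <- function(genos) {
  g <- as.matrix(genos)
  ids <- rownames(g)
  n_loci <- ncol(g) %/% 2
  out <- matrix(NA_real_, nrow(g), nrow(g), dimnames = list(ids, ids))
  for (i in seq_along(ids)) {
    for (j in seq_along(ids)) {
      if (i == j) next                      # leave the diagonal as NA
      shared <- 0
      for (k in seq_len(n_loci)) {
        cols <- c(2 * k - 1, 2 * k)         # the two columns of locus k
        shared <- shared + shared_at_locus(g[i, cols], g[j, cols])
      }
      out[i, j] <- shared / ncol(g)
    }
  }
  out
}

res <- propshared_loci(genos)
res
```

Under this counting rule w and y share three of six allele copies (both copies at L1 plus the 5 at L2), giving 0.5 rather than the 0.333 in the matrix above, and x-y and x-z come out at 0.333 rather than 0.167. If the matrix in the post is what is wanted, the intended rule must differ from the one assumed here, for example by counting at most one match per locus.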
> Question 2: Thanks if you have made it this far. Next I would
> like to calculate a randomized value of the mean proportion of shared
> alleles. To do this I thought I would randomize the original data (genos
> above, say 1000 times), recalculate the proportion of shared alleles at each
> step and then take the mean (my attempt is below). When I do this I get the
> same mean proportion of shared alleles (or behaviours) as the original for
> every randomization. I assume that this is due to some property of
> permuting this type of data that I do not know. Does anyone have a
> recommendation as to how I might get a value of the proportion of shared
> alleles if alleles were distributed (again within loci) at random?
>
> randomize <- function(genos) {
>   x <- apply(genos, 2, sample)
>   rownames(x) <- rownames(genos)
>   x
> }
>
> allele.permute <- function(genos, n) {
>   perms <- replicate(n, randomize(genos), simplify = FALSE)
>   sapply(perms, propshared, simplify = FALSE)
> }
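On Question 2, the constant mean follows from how the two pieces interact. propshared() compares individuals column by column, and shuffling a column only changes which individual holds which value, not how many times each value occurs in that column. The number of matching pairs in a column depends only on those counts, so the mean over all pairs of individuals cannot change. The sketch below checks this numerically and, as one possible alternative, pools the two allele columns of each locus before shuffling. `randomize_within_loci` is a made-up name, and whether pooling is the right null model for the data is an assumption, not something the post settles.

```r
genos <- as.matrix(data.frame(
  L1 = c(2, NA, 1, 3), L1 = c(1, NA, 2, 3),
  L2 = c(5, 2, 5, 3),  L2 = c(3, 4, 2, 4),
  L3 = c(4, 5, 7, 2),  L3 = c(4, 6, 6, 6)
))
rownames(genos) <- c("w", "x", "y", "z")

# Column-by-column proportion shared, the same comparison as in the post.
propshared <- function(g) {
  ids <- rownames(g)
  x <- sapply(ids, function(i) sapply(ids, function(j)
    sum(g[i, ] == g[j, ], na.rm = TRUE) / ncol(g)))
  diag(x) <- NA
  x
}

# Shuffle each column separately, as randomize() in the post does.
randomize <- function(g) {
  x <- apply(g, 2, sample)
  rownames(x) <- rownames(g)
  x
}

# Hypothetical alternative: pool both allele columns of a locus and deal
# the copies back out at random. Assumes adjacent column pairs form a locus.
randomize_within_loci <- function(g) {
  x <- g
  for (k in seq_len(ncol(g) %/% 2)) {
    cols <- c(2 * k - 1, 2 * k)
    x[, cols] <- sample(g[, cols])
  }
  x
}

set.seed(1)
obs <- mean(propshared(genos), na.rm = TRUE)
col_means <- replicate(200, mean(propshared(randomize(genos)), na.rm = TRUE))
pooled_means <- replicate(200,
  mean(propshared(randomize_within_loci(genos)), na.rm = TRUE))

range(col_means - obs)   # column shuffles never move the overall mean
range(pooled_means)      # pooled shuffles give a spread of means
```

A within-locus measure of sharing would also vary under the plain column shuffle, because that shuffle changes which two allele copies end up in the same individual. Summaries other than the overall mean, such as the mean within a subset of individuals, vary under either shuffle.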
>
> I hope this is clear. I appreciate all insights and input.
> Thanks
>
> Grant


