[R] how to check if a variable is preferentially present in a sample

From: Tania Oh <tania.oh_at_bnc.ox.ac.uk>
Date: Tue, 08 Apr 2008 16:24:25 +0100

Dear All,

I do apologise if this question is out of place for this list, but I have searched the mailing list archives and read "Introductory Statistics with R" by Peter Dalgaard without finding any hints for the question below:

I have a data frame (d) of values, which I rank in decreasing order of "val". Each value belongs to one of five groups: 'A', 'B', 'C', 'D', or 'E'. I then take the first 10 entries of 'd' and count the number of occurrences of each group. I want to test whether certain groups occur among these first 10 entries more often than expected by chance. Would a chi-square test or a hypergeometric test be more suitable? If neither, what would be an alternative solution in R? Below is my data:

## data
L5 <- LETTERS[1:5]
d <- data.frame(cbind(val= rnorm(1:10)^2, group=sample(L5,100, repl=TRUE)))

##'data.frame': 100 obs. of 2 variables:
##$ val : Factor w/ 10 levels "0.000169268449333046",..: 10 3 5 6 1 2 7 8 4 9 ...
##$ group: Factor w/ 5 levels "A","B","C","D",..: 4 4 4 5 3 1 5 2 1 2 ...
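
The str() output above shows "val" as a factor with only 10 levels: cbind() coerces both columns to character, and rnorm(1:10) draws just 10 values, which are then recycled. What follows is a minimal sketch of the workflow described in the question, not a definitive analysis. It assumes that 100 numeric values were intended and that a one-sided hypergeometric tail probability per group is an acceptable measure of over-representation. The seed, the names k, top and counts, and the Holm adjustment are illustrative assumptions, not part of the original post.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

L5 <- LETTERS[1:5]

# Build the columns directly so that val stays numeric;
# cbind() would turn both columns into character strings.
d <- data.frame(
  val   = rnorm(100)^2,
  group = factor(sample(L5, 100, replace = TRUE), levels = L5)
)

# Rank by decreasing val and keep the first k rows.
k   <- 10
top <- d[order(d$val, decreasing = TRUE), ][seq_len(k), ]

# Per-group counts in the full data and in the top k rows.
counts <- data.frame(
  group = L5,
  total = as.vector(table(d$group)),
  top   = as.vector(table(top$group))
)

# One-sided hypergeometric tail: P(X >= observed top count) when k rows are
# drawn without replacement from nrow(d) rows, 'total' of which are in the group.
# phyper(q, ..., lower.tail = FALSE) gives P(X > q), hence the "- 1".
counts$p_enriched <- phyper(counts$top - 1, counts$total,
                            nrow(d) - counts$total, k, lower.tail = FALSE)

# Five groups are tested, so adjust for multiple comparisons (assumed: Holm).
counts$p_adjusted <- p.adjust(counts$p_enriched, method = "holm")

print(counts)
```

For a single group, the same one-sided p-value should also come out of fisher.test() on the 2x2 table of group membership against top-k membership with alternative = "greater". With only 10 rows spread over five groups, the expected counts are around 2 per group, so the chi-square approximation is likely to be poor here.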

Many thanks in advance and apologies again,
tania

DPhil student
Department of Physiology, Anatomy and Genetics
University of Oxford

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 08 Apr 2008 - 15:33:04 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 08 Apr 2008 - 21:30:34 GMT.
