Hi all,
I am a real novice to R. :)
I am struggling with a problem for generating colored dendrogram. I have searched the R list and complied/collected a R code which can generated a colored dendrogram based on the rainbow color and 4x4 similarity matrix (say matrix:m).
In this dendrogram, each leaf is colored differently. But, I do not want the
leaf colored on a random basis. I want to assign color for leaf on basis of
some criterion.
My actual problem: How to generate a dendrogram with the leaf colored
according to the values in the second matrix (which is 4x1 dim, say matrix
c)? Meaning, the leaf 1 and 2 should be colored in neighboring spectra of
color (say different shade of red) and leaf 3 and 4 in a different color
(say different shade of blue)
Below is R code and data matrix
Any help would be highly appreciated.
Thanks
Om Prakash
Bioinformatics Center
University of Pune
India
R code:
hc<-hcluster(m, link="ave")
dhc<-as.dendrogram(hc)
dendrapply(dhc, function(n) str(attributes(n)))
local({
colLab <<- function(n){
if(is.leaf(n)){
a <- attributes(n)
i <<- i+1
attr(n, "nodePar") <- c(a$nodePar, list(lab.col = mycols[i], lab.font=i%%3))
}
n
}
#mycols <- grDevices::rainbow(attr(dhc,"members"))
i <- 0
})
dL <- dendrapply(dhc, colLab)
plot(dL)
On 12/26/06, r-help-request@stat.math.ethz.ch < r-help-request@stat.math.ethz.ch> wrote:
> > Send R-help mailing list submissions to > r-help@stat.math.ethz.ch > > To subscribe or unsubscribe via the World Wide Web, visit > https://stat.ethz.ch/mailman/listinfo/r-help > or, via email, send a message with subject or body 'help' to > r-help-request@stat.math.ethz.ch > > You can reach the person managing the list at > r-help-owner@stat.math.ethz.ch > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of R-help digest..." > > > Today's Topics: > > 1. Re: how to 'get' an object that is part of a list > (Gabor Grothendieck) > 2. Re: extend summary.lm for hccm? (Achim Zeileis) > 3. Re: extend summary.lm for hccm? (John Fox) > 4. Hmisc - some latex problems (steve) > 5. Re: Hmisc - some latex problems (Frank E Harrell Jr) > 6. Problem to generate training data set and test data set > (Aimin Yan) > 7. No fonts in graphics using GDD (Miguel Tremblay) > 8. Re: how to 'get' an object that is part of a list > (Christos Hatzis) > 9. Bayesian data mining (Frank Grex) > 10. Bayesian data mining (fwd) (Frank Grex) > 11. defining color sequence in image() (Milton Cezar Ribeiro) > 12. Re: Problem to generate training data set and test data set > (Jim Lemon) > 13. Re: Higher Dimensional Matrices (Bill.Venables@csiro.au) > 14. sequential row selection in dataframe (Pedro Mardones) > 15. Re: Higher Dimensional Matrices (downunder) > 16. Merry Xmas (justin bem) > 17. Re: defining color sequence in image() (talepanda) > 18. HOW TO: vectorize an iterative process. (Geoffrey Zhu) > 19. writing to S-PLUS .dat file (Sebastian Michalski) > 20. Re: extend summary.lm for hccm? (Achim Zeileis) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 25 Dec 2006 08:33:54 -0500 > From: "Gabor Grothendieck" <ggrothendieck@gmail.com> > Subject: Re: [R] how to 'get' an object that is part of a list > To: christos@nuverabio.com > Cc: r-help@stat.math.ethz.ch > Message-ID: > <971536df0612250533n6ba6e8f4o75975036512154ae@mail.gmail.com> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > my.length.2 also has the advantage of eliminating the eval. > > On 12/25/06, Christos Hatzis <christos@nuverabio.com> wrote: > > Thank you Gabor. Very interesting solution. > > If I get it right, the first argument in function f is just a > placeholder to > > help extract the right element out of the list(...) that is passed to > > length. Very smart trick. > > > > Jim's solution appears a bit simpler, at least along the lines that I > was > > thinking: > > > > my.length <- function(...) { > > names <- as.character(substitute(list(...)))[-1] > > sapply(names, function(x){y <- eval(parse(text=x)); length(y)}) > > } > > > > -Christos > > > > -----Original Message----- > > From: Gabor Grothendieck [mailto:ggrothendieck@gmail.com] > > Sent: Sunday, December 24, 2006 7:41 AM > > To: jim holtman > > Cc: christos@nuverabio.com; > > Subject: Re: [R] how to 'get' an object that is part of a list > > > > Is this what you are looking for: > > > > > my.length.2 <- function(...) { > > + f <- function(nm, val) length(val) > > + mapply(f, make.names(as.list(match.call()[-1])), list(...)) } > > > my.length.2(xx, xx$b) > > xx xx.b > > 2 5 > > > > On 12/24/06, jim holtman <jholtman@gmail.com> wrote: > > > use 'eval' and 'parse' > > > > > > > > > > xx > > > $a > > > [1] 1 2 3 4 5 > > > > > > $b > > > [1] "a" "b" "c" "d" "e" > > > > > > > eval(parse(text='xx$a')) > > > [1] 1 2 3 4 5 > > > > > > > > > > So that way you can pass in the character string and then 'parse' it. > > > > > > > > > > > > On 12/24/06, Christos Hatzis <christos@nuverabio.com> wrote: > > > > > > > > This might be an trivial thing but I am stuck. > > > > > > > > Consider: > > > > > > > > xx <- list(a=1:5, b=letters[1:5]) > > > > > > > > Although object xx is accessible through its name, how can object > > > > xx$b be accessed similarly through its name? > > > > > > > > > get("xx") > > > > $a > > > > [1] 1 2 3 4 5 > > > > > > > > $b > > > > [1] "a" "b" "c" "d" "e" > > > > > > > > > get("xx$b") > > > > Error in get(x, envir, mode, inherits) : variable "xx$b" was not > > > > found > > > > > > > > get("xx")$b will not work in my case because it will probably > > > > require parsing to make it work within a function. E.g. > > > > > > > > my.length <- function(...) { > > > > names <- as.character(substitute(list(...)))[-1] > > > > sapply(names, FUN=function(x){y <- get(x); length(y)}) } > > > > > my.length(xx) > > > > xx > > > > 2 > > > > > my.length(xx$a) > > > > Error in get(x, envir, mode, inherits) : variable "xx$a" was not > > > > found > > > > > my.length(xx$a, xx$b) > > > > Error in get(x, envir, mode, inherits) : variable "xx$a" was not > > > > found > > > > > > > > Thank you. > > > > > > > > Christos Hatzis, Ph.D. > > > > Nuvera Biosciences, Inc. > > > > 400 West Cummings Park > > > > Suite 5350 > > > > Woburn, MA 01801 > > > > Tel: 781-938-3830 > > > > www.nuverabio.com > > > > > > > > ______________________________________________ > > > > R-help@stat.math.ethz.ch mailing list > > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > > PLEASE do read the posting guide > > > > http://www.R-project.org/posting-guide.html > > > > and provide commented, minimal, self-contained, reproducible code. > > > > > > > > > > > > > > > > -- > > > Jim Holtman > > > Cincinnati, OH > > > +1 513 646 9390 > > > > > > What is the problem you are trying to solve? > > > > ------------------------------ > > Message: 2 > Date: Mon, 25 Dec 2006 14:38:07 +0100 (CET) > From: Achim Zeileis <Achim.Zeileis@wu-wien.ac.at> > Subject: Re: [R] extend summary.lm for hccm? > To: John Fox <jfox@mcmaster.ca> > Cc: r-help@stat.math.ethz.ch, 'ivo welch' <ivowel@gmail.com>, 'Dirk > Eddelbuettel' <edd@debian.org> > Message-ID: > <Pine.LNX.4.44.0612251406500.8229-100000@disco.wu-wien.ac.at> > Content-Type: TEXT/PLAIN; charset=US-ASCII > > On Sun, 24 Dec 2006, John Fox wrote: > > > Dear Dirk and Ivo, > > > > It's true that the sandwich package has more extensive facilities for > > sandwich variance estimation than the hccm function in the car package, > but > > I think that the thrust of Ivo's question is to get a model summary that > > automatically uses the adjusted standard errors rather than having to > > compute them and use them "manually." > > I've written the function coeftest() in package "lmtest" for this purpose. > With this you can > coeftest(obj, vcov = sandwich) > or you can put this into a specialized summary() function as John > suggested (where you probably would want the F statistic to use the other > vcov as well). See also function waldtest() in "lmtest", > linear.hypothesis() in "car" and > vignette("sandwich", package = "sandwich") > > Although this works, it is still a nuisance to use a different function > and not summary() directly. In addition, it would also be nice to plug in > different vcovs into confint() or predict() methods. Of course, one could > write different generics or overload the methods in certain packages. > However, I guess that many practitioners want to use different vcov > estimators - especially in the social and political scieneces, and > econometrics etc. - so that this might justify that the original "lm" (and > "glm") methods are extended to allow for plugging in different vcov > matrices. Maybe we could try to convince R-core to include somthing like > this? > > Best wishes, > Z > > > > ------------------------------ > > Message: 3 > Date: Mon, 25 Dec 2006 10:01:59 -0500 > From: "John Fox" <jfox@mcmaster.ca> > Subject: Re: [R] extend summary.lm for hccm? > To: "'Achim Zeileis'" <Achim.Zeileis@wu-wien.ac.at> > Cc: r-help@stat.math.ethz.ch, 'ivo welch' <ivowel@gmail.com>, 'Dirk > Eddelbuettel' <edd@debian.org> > Message-ID: > < > 20061225150156.ZBJX17401.tomts10-srv.bellnexxia.net@JohnDesktop8300> > Content-Type: text/plain; charset="us-ascii" > > Dear Achim, > > > -----Original Message----- > > From: Achim Zeileis [mailto:Achim.Zeileis@wu-wien.ac.at] > > Sent: Monday, December 25, 2006 8:38 AM > > To: John Fox > > Cc: 'Dirk Eddelbuettel'; 'ivo welch'; r-help@stat.math.ethz.ch > > Subject: Re: [R] extend summary.lm for hccm? > > > > On Sun, 24 Dec 2006, John Fox wrote: > > > > > Dear Dirk and Ivo, > > > > > > It's true that the sandwich package has more extensive > > facilities for > > > sandwich variance estimation than the hccm function in the car > > > package, but I think that the thrust of Ivo's question is to get a > > > model summary that automatically uses the adjusted standard errors > > > rather than having to compute them and use them "manually." > > > > I've written the function coeftest() in package "lmtest" for > > this purpose. > > With this you can > > coeftest(obj, vcov = sandwich) > > Since coeftest() already exists in a package, I think that this is a > preferable solution. > > > or you can put this into a specialized summary() function as > > John suggested (where you probably would want the F statistic > > to use the other vcov as well). > > Good point. Here's a modified version that also fixes up the F-test: > > summaryHCCM.lm <- function(model, type=c("hc3", "hc0", "hc1", "hc2", > "hc4"), > > ...){ > if (!require(car)) stop("Required car package is missing.") > type <- match.arg(type) > V <- hccm(model, type=type) > sumry <- summary(model) > table <- coef(sumry) > table[,2] <- sqrt(diag(V)) > table[,3] <- table[,1]/table[,2] > table[,4] <- 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE) > sumry$coefficients <- table > p <- nrow(table) > hyp <- if (has.intercept(model)) cbind(0, diag(p - 1)) else diag(p) > sumry$fstatistic[1] <- linear.hypothesis(model, hyp, > white.adjust=type)[2,"F"] > print(sumry) > cat("Note: Heteroscedasticity-consistant standard errors using > adjustment", > type, "\n") > } > > Regards, > John > > > See also function waldtest() > > in "lmtest", > > linear.hypothesis() in "car" and > > vignette("sandwich", package = "sandwich") > > > > Although this works, it is still a nuisance to use a > > different function and not summary() directly. In addition, > > it would also be nice to plug in different vcovs into > > confint() or predict() methods. Of course, one could write > > different generics or overload the methods in certain packages. > > However, I guess that many practitioners want to use > > different vcov estimators - especially in the social and > > political scieneces, and econometrics etc. - so that this > > might justify that the original "lm" (and > > "glm") methods are extended to allow for plugging in > > different vcov matrices. Maybe we could try to convince > > R-core to include somthing like this? > > > > Best wishes, > > Z > > > > > > > > > > ------------------------------ > > Message: 4 > Date: Mon, 25 Dec 2006 10:21:55 -0500 > From: steve <fisk@bowdoin.edu> > Subject: [R] Hmisc - some latex problems > To: r-help@stat.math.ethz.ch > Message-ID: <emoqak$6di$1@sea.gmane.org> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > If I use latex with describe: > (faithful is the Old faithful data) > > latex(describe(faithful),file="describe.tex") > > then the first few lines of describe.tex are > > \begin{spacing}{0.7} > \begin{center} \bf faithful \\ 2 Variables~~~~~ 272 ~Observations > \end{center} > > I have two problems. First, I don't know what package > the environment "spacing" comes from. (There is also a command \smaller, > but that is from the relsize package. Perhaps the latex man page could > list the latex packages that are necessary to create correct output.) > > Second, I use the memoir class, and it flags \bf as an error - > \textbf should be used instead. A correct version would be > > > \begin{center}\textbf{ faithful \\ 2 Variables~~~~~ 272 > ~Observations}\end{center} > > thank you, > > Steve > > > > ------------------------------ > > Message: 5 > Date: Mon, 25 Dec 2006 09:50:08 -0600 > From: Frank E Harrell Jr <f.harrell@vanderbilt.edu> > Subject: Re: [R] Hmisc - some latex problems > To: steve <fisk@bowdoin.edu> > Cc: r-help@stat.math.ethz.ch > Message-ID: <458FF330.606@vanderbilt.edu> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > steve wrote: > > If I use latex with describe: > > (faithful is the Old faithful data) > > > > latex(describe(faithful),file="describe.tex") > > > > then the first few lines of describe.tex are > > > > \begin{spacing}{0.7} > > \begin{center} \bf faithful \\ 2 Variables~~~~~ 272 ~Observations > > \end{center} > > > > I have two problems. First, I don't know what package > > the environment "spacing" comes from. (There is also a command \smaller, > > but that is from the relsize package. Perhaps the latex man page could > > list the latex packages that are necessary to create correct output.) > > The latex object created by latex.describe lists the necessary styles - > setspace and relsize - but I'll make that clear in the documentation. > Note for debian linux users - these styles are included in the package > tetex-extra. > > > > > Second, I use the memoir class, and it flags \bf as an error - > > \textbf should be used instead. A correct version would be > > > > > > \begin{center}\textbf{ faithful \\ 2 Variables~~~~~ 272 > > ~Observations}\end{center} > > Thanks - will fix in next release > > Frank > > > > > thank you, > > > > Steve > > > > ------------------------------ > > Message: 6 > Date: Mon, 25 Dec 2006 11:35:17 -0600 > From: Aimin Yan <aiminy@iastate.edu> > Subject: [R] Problem to generate training data set and test data set > To: r-help@stat.math.ethz.ch > Message-ID: > <6.1.2.0.2.20061225112056.01c5c4e0@aiminy.mail.iastate.edu> > Content-Type: text/plain; charset="us-ascii"; format=flowed > > I have a full data set like this: > > aa bas aas bms ams bcu acu omega y > 1 ALA 0 127.71 0 69.99 0 -0.2498560 79.91470 outward > 2 PRO 0 68.55 0 55.44 0 -0.0949008 76.60380 outward > 3 ALA 0 52.72 0 47.82 0 -0.0396550 52.19970 outward > 4 PHE 0 22.62 0 31.21 0 0.1270330 169.52500 inward > 5 SER 0 71.32 0 52.84 0 -0.1312380 7.47528 outward > 6 VAL 0 12.92 0 22.40 0 0.1728390 149.09400 inward > > ...................................................................................... > > > aa have 19 levels, and there are different number of observation for each > levels. > I want to pick 75% of observations of each levels randomly to generate a > training set, > and 25% of observation of each levels to generate a testing set. > > Does anyone know to do this? > > Thanks > > Aimin Yan > > > > ------------------------------ > > Message: 7 > Date: Mon, 25 Dec 2006 15:10:23 -0500 (EST) > From: Miguel Tremblay <miguel.tremblay@ptaff.ca> > Subject: [R] No fonts in graphics using GDD > To: r-help@stat.math.ethz.ch > Message-ID: <Pine.LNX.4.64.0612251508440.14703@octet.ca> > Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed > > I have problem to have fonts in graphics using GDD. > > R version: R 2.2.1 (2005-12-20). > GDD version: GDD_0.1-8.tar.gz > Platform: Linux 2.6.17-gentoo-r8 #2 SMP Sat Nov 4 21:16:47 EST 2006 i686 > Intel(R) Pentium(R) 4 CPU 3.00GHz GenuineIntel GNU/Linux > > > I have configured the basefont.mapping file accordingly to the path of the > font on my computer: > base.norm:/usr/share/fonts/corefonts/arial.ttf > base.ital:/usr/share/fonts/corefonts/ariali.ttf > base.bold:/usr/share/fonts/corefonts/arialbd.ttf > base.bita:/usr/share/fonts/corefonts/arialbi.ttf > > Those files exists and are present at the right place: > ls /usr/share/fonts/corefonts/arial*.ttf > /usr/share/fonts/corefonts/arial.ttf > /usr/share/fonts/corefonts/arialbi.ttf > /usr/share/fonts/corefonts/arialbd.ttf > /usr/share/fonts/corefonts/ariali.ttf > > In R, when I import the GDD library and look at the font path, they are > correct: > > library('GDD') > > .Call("gdd_look_up_font", NULL) > [1] "/usr/share/fonts/corefonts/arial.ttf" > [2] "/usr/share/fonts/corefonts/arialbd.ttf" > [3] "/usr/share/fonts/corefonts/ariali.ttf" > [4] "/usr/share/fonts/corefonts/arialbi.ttf" > [5] NA > > > When I create a graphic, there is no font: > > library(GDD) > > GDD() > > plot(rnorm) > > dev.off() > > I have went through the following threads and it did'nt help: > > http://mailman.rz.uni-augsburg.de/pipermail/stats-rosuda-devel/2006q1/000197.html > http://tolstoy.newcastle.edu.au/R/help/06/04/25159.html > > > The ouptut of my GDD installation is: > R CMD INSTALL GDD_0.1-8.tar.gz > * Installing *source* package 'GDD' ... > checking for gcc... i686-pc-linux-gnu-gcc > checking for C compiler default output file name... a.out > checking whether the C compiler works... yes > checking whether we are cross compiling... no > checking for suffix of executables... > checking for suffix of object files... o > checking whether we are using the GNU C compiler... yes > checking whether i686-pc-linux-gnu-gcc accepts -g... yes > checking for i686-pc-linux-gnu-gcc option to accept ANSI C... none needed > checking how to run the C preprocessor... i686-pc-linux-gnu-gcc -E > checking for egrep... grep -E > checking for ANSI C header files... yes > checking for sys/wait.h that is POSIX.1 compatible... yes > checking for sys/types.h... yes > checking for sys/stat.h... yes > checking for stdlib.h... yes > checking for string.h... yes > checking for memory.h... yes > checking for strings.h... yes > checking for inttypes.h... yes > checking for stdint.h... yes > checking for unistd.h... yes > checking for string.h... (cached) yes > checking sys/time.h usability... yes > checking sys/time.h presence... yes > checking for sys/time.h... yes > checking for unistd.h... (cached) yes > checking for an ANSI C-conforming const... yes > checking whether time.h and sys/time.h may both be included... yes > checking for stdlib.h... (cached) yes > checking for GNU libc compatible malloc... yes > checking return type of signal handlers... void > checking for memset... yes > checking for mkdir... yes > checking for rmdir... yes > checking for gdlib-config... /usr/bin/gdlib-config > libgd-flags according to gdlib-config: > -I/usr/include -L/usr/lib -ljpeg -lfontconfig -lpng12 -lz -lm > if that is not correct, fix your gd installation > checking gd.h usability... yes > checking gd.h presence... yes > checking for gd.h... yes > checking for gdImageCreateFromPng in -lgd... yes > checking usability of FreeType in GD... yes > checking whether GD programs can be compiled... yes > configure: creating ./config.status > config.status: creating src/Makevars > config.status: creating src/gddconfig.h > ** libs > i686-pc-linux-gnu-gcc -I/usr/lib/R/include -I/usr/include -I/usr/include > -DHAVE_GDDCONFIG_H -I. -Iinclude -I/usr/local/include > -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -O2 > -march=pentium4 -fomit-frame-pointer -pipe -fPIC -D_FILE_OFFSET_BITS=64 > -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -O2 -march=pentium4 > -fomit-frame-pointer -pipe -c GDD.c -o GDD.o > i686-pc-linux-gnu-gcc -I/usr/lib/R/include -I/usr/include -I/usr/include > -DHAVE_GDDCONFIG_H -I. -Iinclude -I/usr/local/include > -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -O2 > -march=pentium4 -fomit-frame-pointer -pipe -fPIC -D_FILE_OFFSET_BITS=64 > -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -O2 -march=pentium4 > -fomit-frame-pointer -pipe -c GDDtalk.c -o GDDtalk.o > i686-pc-linux-gnu-gcc -shared -o GDD.so GDD.o GDDtalk.o -L/usr/lib > -L/usr/lib -lgd -ljpeg -lfontconfig -lpng12 -lz -lm -L/usr/lib/R/lib -lR > ** R > ** inst > ** help > >>> Building/Updating help pages for package 'GDD' > Formats: text html latex example > GDD text html latex example > GDDfont text html latex example > ** building package indices ... > * DONE (GDD) > > > > Anyone can help me with this? > > > Miguel Tremblay > http://ptaff.ca/miguel/ > > > > ------------------------------ > > Message: 8 > Date: Mon, 25 Dec 2006 17:08:26 -0500 > From: "Christos Hatzis" <christos@nuverabio.com> > Subject: Re: [R] how to 'get' an object that is part of a list > To: "'Gabor Grothendieck'" <ggrothendieck@gmail.com> > Cc: r-help@stat.math.ethz.ch > Message-ID: > <001401c72871$34007850$0202a8c0@headquarters.silicoinsights> > Content-Type: text/plain; charset="us-ascii" > > True. Thanks again. > > -----Original Message----- > From: Gabor Grothendieck [mailto:ggrothendieck@gmail.com] > Sent: Monday, December 25, 2006 8:34 AM > To: christos@nuverabio.com > Cc: r-help@hypatia.math.ethz.ch > Subject: Re: [R] how to 'get' an object that is part of a list > > my.length.2 also has the advantage of eliminating the eval. > > On 12/25/06, Christos Hatzis <christos@nuverabio.com> wrote: > > Thank you Gabor. Very interesting solution. > > If I get it right, the first argument in function f is just a > > placeholder to help extract the right element out of the list(...) > > that is passed to length. Very smart trick. > > > > Jim's solution appears a bit simpler, at least along the lines that I > > was > > thinking: > > > > my.length <- function(...) { > > names <- as.character(substitute(list(...)))[-1] > > sapply(names, function(x){y <- eval(parse(text=x)); length(y)}) > > } > > > > -Christos > > > > -----Original Message----- > > From: Gabor Grothendieck [mailto:ggrothendieck@gmail.com] > > Sent: Sunday, December 24, 2006 7:41 AM > > To: jim holtman > > Cc: christos@nuverabio.com; > > Subject: Re: [R] how to 'get' an object that is part of a list > > > > Is this what you are looking for: > > > > > my.length.2 <- function(...) { > > + f <- function(nm, val) length(val) > > + mapply(f, make.names(as.list(match.call()[-1])), list(...)) } > > > my.length.2(xx, xx$b) > > xx xx.b > > 2 5 > > > > On 12/24/06, jim holtman <jholtman@gmail.com> wrote: > > > use 'eval' and 'parse' > > > > > > > > > > xx > > > $a > > > [1] 1 2 3 4 5 > > > > > > $b > > > [1] "a" "b" "c" "d" "e" > > > > > > > eval(parse(text='xx$a')) > > > [1] 1 2 3 4 5 > > > > > > > > > > So that way you can pass in the character string and then 'parse' it. > > > > > > > > > > > > On 12/24/06, Christos Hatzis <christos@nuverabio.com> wrote: > > > > > > > > This might be an trivial thing but I am stuck. > > > > > > > > Consider: > > > > > > > > xx <- list(a=1:5, b=letters[1:5]) > > > > > > > > Although object xx is accessible through its name, how can object > > > > xx$b be accessed similarly through its name? > > > > > > > > > get("xx") > > > > $a > > > > [1] 1 2 3 4 5 > > > > > > > > $b > > > > [1] "a" "b" "c" "d" "e" > > > > > > > > > get("xx$b") > > > > Error in get(x, envir, mode, inherits) : variable "xx$b" was not > > > > found > > > > > > > > get("xx")$b will not work in my case because it will probably > > > > require parsing to make it work within a function. E.g. > > > > > > > > my.length <- function(...) { > > > > names <- as.character(substitute(list(...)))[-1] > > > > sapply(names, FUN=function(x){y <- get(x); length(y)}) } > > > > > my.length(xx) > > > > xx > > > > 2 > > > > > my.length(xx$a) > > > > Error in get(x, envir, mode, inherits) : variable "xx$a" was not > > > > found > > > > > my.length(xx$a, xx$b) > > > > Error in get(x, envir, mode, inherits) : variable "xx$a" was not > > > > found > > > > > > > > Thank you. > > > > > > > > Christos Hatzis, Ph.D. > > > > Nuvera Biosciences, Inc. > > > > 400 West Cummings Park > > > > Suite 5350 > > > > Woburn, MA 01801 > > > > Tel: 781-938-3830 > > > > www.nuverabio.com > > > > > > > > ______________________________________________ > > > > R-help@stat.math.ethz.ch mailing list > > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > > PLEASE do read the posting guide > > > > http://www.R-project.org/posting-guide.html > > > > and provide commented, minimal, self-contained, reproducible code. > > > > > > > > > > > > > > > > -- > > > Jim Holtman > > > Cincinnati, OH > > > +1 513 646 9390 > > > > > > What is the problem you are trying to solve? > > > > ------------------------------ > > Message: 9 > Date: Mon, 25 Dec 2006 18:25:33 -0500 (EST) > From: Frank Grex <franco@grex.cyberspace.org> > Subject: [R] Bayesian data mining > To: r-help@stat.math.ethz.ch > Message-ID: <Pine.BSO.4.63.0612251817440.31320@grex.cyberspace.org> > Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed > > Hi, I need a help to know whether I can perform the following in R: > I have a set of observations (Ns) and each observation is drawn from a > poisson distribution with an unkown mean, lambda. The set of lambdas in > their turn are drawn from a common prior distribution which is supposed to > be a a mixture of two gamma distributions. > Is there a way to determine the poisson means in R, given the Ns and the > probabilities? > And how can one determine the two alphas and the two betas of the gamma > mixtures? I am assuming there will be an MLE somewhere. > This is to help me understand and apply William DuMouchel concept of > datamining especially in his article: "Bayesian data mining in large > frequency tables". Thanks > > > > ------------------------------ > > Message: 10 > Date: Mon, 25 Dec 2006 18:31:33 -0500 (EST) > From: Frank Grex <franco@grex.cyberspace.org> > Subject: [R] Bayesian data mining (fwd) > To: r-help@stat.math.ethz.ch > Message-ID: <Pine.BSO.4.63.0612251830460.3501@grex.cyberspace.org> > Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed > > Hi, I need a help to know whether I can perform the following in R: > I have a set of observations (Ns) and each observation is drawn from a > poisson distribution with an unkown mean, lambda. The set of lambdas in > their turn are drawn from a common prior distribution which is supposed to > be a a mixture of two gamma distributions. > Is there a way to determine the poisson means in R, given the Ns and the > probabilities? > And how can one determine the two alphas and the two betas of the gamma > mixtures? I am assuming there will be an MLE somewhere. > This is to help me understand and apply William DuMouchel concept of > datamining especially in his article: "Bayesian data mining in large > frequency tables". Thanks > > ______________________________________________ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > > ------------------------------ > > Message: 11 > Date: Tue, 26 Dec 2006 00:03:44 +0000 (GMT) > From: Milton Cezar Ribeiro <milton_ruser@yahoo.com.br> > Subject: [R] defining color sequence in image() > To: R-help <r-help@stat.math.ethz.ch> > Message-ID: <973009.27835.qm@web56608.mail.re3.yahoo.com> > Content-Type: text/plain > > Dear All, > > How can I define a color sequence for each image value? I have several > images with values varying from 1 to 5, and I would like to assing a fixed > color for each value (green for 1, yellow for 2...). I used somethink like: > > > image(img,col=c("green","yellow",...)) > > but unfortunately whem I have absent values, the color that I defined for > each values change. > > Thanks a lot, > > Miltinho > Brazil > > PS : Merry Christmas!!! > > __________________________________________________ > > > [[alternative HTML version deleted]] > > > > ------------------------------ > > Message: 12 > Date: Tue, 26 Dec 2006 11:16:20 +1100 > From: Jim Lemon <jim@bitwrit.com.au> > Subject: Re: [R] Problem to generate training data set and test data > set > To: Aimin Yan <aiminy@iastate.edu> > Cc: r-help@stat.math.ethz.ch > Message-ID: <459069D4.8080103@bitwrit.com.au> > Content-Type: text/plain; charset=us-ascii; format=flowed > > Aimin Yan wrote: > > I have a full data set like this: > > > > aa bas aas bms ams bcu acu omega y > > 1 ALA 0 127.71 0 69.99 0 -0.2498560 79.91470 outward > > 2 PRO 0 68.55 0 55.44 0 -0.0949008 76.60380 outward > > 3 ALA 0 52.72 0 47.82 0 -0.0396550 52.19970 outward > > 4 PHE 0 22.62 0 31.21 0 0.1270330 169.52500 inward > > 5 SER 0 71.32 0 52.84 0 -0.1312380 7.47528 outward > > 6 VAL 0 12.92 0 22.40 0 0.1728390 149.09400 inward > > > ...................................................................................... > > > > > > aa have 19 levels, and there are different number of observation for > each > > levels. > > I want to pick 75% of observations of each levels randomly to generate a > > training set, > > and 25% of observation of each levels to generate a testing set. > > > Hi Aimin, > I haven't tested this exhaustively, but I think it does what you want. > > get.prob.sample<-function(x,prob=0.5) { > xlevels<-levels(as.factor(x)) > xlength<-length(x) > xsamp<-rep(FALSE,xlength) > for(i in xlevels) { > lengthi<-length(x[x == i]) > xsamp[sample(which(x == i),lengthi*prob)]<-TRUE > } > return(xsamp) > } > > get.prob.sample(mydata$aa,0.75) > > Jim > > > > ------------------------------ > > Message: 13 > Date: Tue, 26 Dec 2006 10:59:45 +1000 > From: <Bill.Venables@csiro.au> > Subject: Re: [R] Higher Dimensional Matrices > To: <larsfromspace@web.de>, <r-help@stat.math.ethz.ch> > Message-ID: > <B998A44C8986644EA8029CFE6396A924840A8E@exqld2-bne.qld.csiro.au> > Content-Type: text/plain; charset="us-ascii" > > Lars from space [sic] asks: > > -----Original Message----- > > > From: r-help-bounces@stat.math.ethz.ch > > [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of downunder > > Sent: Monday, 25 December 2006 12:31 PM To: r-help@stat.math.ethz.ch > > Subject: [R] Higher Dimensional Matrices > > > > > > Hi all. > > > > I want to calculate partial correlations while controlling for one > > or more variables. That works already fine when I control for > > example just for x[,1] and x[,2] that gives me one single > > correlation matrix and i have to to it for x [,1]...x[,10]. That > > would give me out 10 matrices. Controlling for 2 Variables 100 > > matrices. how can I run a loop to get f.e the 10 or 100 matrices at > > once? I appreciate for every hint. have nice holidays. > > I don't quite understand this. You have 10 variables and you want to > find the partial correlations controlling for two of them at a time. > If you take each possible set of two variables to condition upon at a > time, this would give you choose(10, 2) = 45 matrices, wouldn't it? > Where do you get '10 or 100' matrices from? > > > > > greetings lars > > > > x <- read.table("C:/.....dat") > > dim(x) #200x10 > > a <- matrix(0,200,10) > > for (i in 1:10) > > a[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2])) > > b <- matrix(0,200,10) > > for (i in 1:10) > > b[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2])) > > #a=round(a,5) > > #b=round(b,5) > > d <- cor(a,b) > > d > > But a and b are the same, aren't they? Why do you need to compute > them twice? Why not just use cor(a, a) [which is the same as cor(a), > of course]? > > There is a more serious problem here, though. Your residuals are > after regression on x[, 1:2] so if you *select* x[, 1:2] as the > y-variables in your regression then the residuals are going to be > zero, but in practice round-off error. so the first two rows and > columns of d will be correlations with round-off error, > i.e. misleading junk. It doesn't make sense to include the > conditioning variables in the correlation matrix *conditioning on > them*. Only the 8 x 8 matrix of the others among themselves is > defined, really. > > So how do we do it? Here are a few pointers. > > To start, here is a function that uses a somewhat more compact way of > finding the partial correlations than your method. Sorting out how it > works should be a useful exercise. > > partialCorr <- function (cond, mat) > cor(qr.resid(qr(cbind(1, mat[, cond])), mat[, -cond])) > > To find the matrix of partial correlations conditioning on x[, 1:2] > you would use > > d <- partialCorr(c(1,2), x) > > So how to do it for all possible conditioning pairs of variables. > Well you could do it in an obvious loop: > > cmats <- array(0, c(8,8,45)) > k <- 0 > for(i in 1:9) for(j in (i+1):10) { > k <- k+1 > cmats[, , k] <- partialCorr(c(i, j), x) > } > > Now the results are in a 3-way array, but without any kind of labels. > Perhaps you should think about how to fix that one yourself... > > With more than two conditioning variables you should probably use a > function to generate the subsets of the appropriate size rather than > trying to write ever more deeply nested loops. There are plenty of > functions around to do this. > > Bill Venables. > > > > ------------------------------ > > Message: 14 > Date: Tue, 26 Dec 2006 00:07:57 -0500 > From: "Pedro Mardones" <mardones.p@gmail.com> > Subject: [R] sequential row selection in dataframe > To: R-help@stat.math.ethz.ch > Message-ID: > <83dca7860612252107j60f4836aq6cf3debc1dec4045@mail.gmail.com> > Content-Type: text/plain; charset=UTF-8; format=flowed > > Dear all; > > I'm wondering if there is any 'efficient' approach for selecting a > sample of 'every nth rows' from a dataframe. For example, let's use > the dataframe GAGurine in MASS library: > > > length(GAGurine[,1]) > [1] 314 > > # select an 75% of the dataset, i.e. = 236 rows, every 2 rows starting > from row 1 > > test<-GAGurine[seq(1,314,2),] > > length(test[,1]) > [1] 157 > > # so, I still need another 79 rows, one way could be: > test2<-GAGurine[-seq(1,314,2),] > > length(test2[,1]) > [1] 157 > > test3<-test2[seq(1,157,2),] > > # and then > final<-rbind(test2,test3) > > length(final[,1]) > [1] 236 > > Does anyone have a better idea to get the same results but without > creating different datasets like test2 and test3? > > Thanks > PM > > > > ------------------------------ > > Message: 15 > Date: Mon, 25 Dec 2006 22:15:08 -0800 (PST) > From: downunder <larsfromspace@web.de> > Subject: Re: [R] Higher Dimensional Matrices > To: r-help@stat.math.ethz.ch > Message-ID: <8051407.post@talk.nabble.com> > Content-Type: text/plain; charset=us-ascii > > > Hi Bill. Thanks for your extensive hints especially with arrays. > That solves my problem now and I am also able to control for every > combination of variables. > > Have a merry christmas. lars > > > > > > x <- read.table("C:/.....dat") > > dim(x) #200x10 > > a <- matrix(0,200,10) > > for (i in 1:10) > > a[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2])) > > b <- matrix(0,200,10) > > for (i in 1:10) > > b[,i] <- residuals(lm(x[,i]~1+x[,1]+x[,2])) > > #a=round(a,5) > > #b=round(b,5) > > d <- cor(a,b) > > d > > But a and b are the same, aren't they? Why do you need to compute > them twice? Why not just use cor(a, a) [which is the same as cor(a), > of course]? > > There is a more serious problem here, though. Your residuals are > after regression on x[, 1:2] so if you *select* x[, 1:2] as the > y-variables in your regression then the residuals are going to be > zero, but in practice round-off error. so the first two rows and > columns of d will be correlations with round-off error, > i.e. misleading junk. It doesn't make sense to include the > conditioning variables in the correlation matrix *conditioning on > them*. Only the 8 x 8 matrix of the others among themselves is > defined, really. > > So how do we do it? Here are a few pointers. > > To start, here is a function that uses a somewhat more compact way of > finding the partial correlations than your method. Sorting out how it > works should be a useful exercise. > > partialCorr <- function (cond, mat) > cor(qr.resid(qr(cbind(1, mat[, cond])), mat[, -cond])) > > To find the matrix of partial correlations conditioning on x[, 1:2] > you would use > > d <- partialCorr(c(1,2), x) > > So how to do it for all possible conditioning pairs of variables. > Well you could do it in an obvious loop: > > cmats <- array(0, c(8,8,45)) > k <- 0 > for(i in 1:9) for(j in (i+1):10) { > k <- k+1 > cmats[, , k] <- partialCorr(c(i, j), x) > } > > Now the results are in a 3-way array, but without any kind of labels. > Perhaps you should think about how to fix that one yourself... > > With more than two conditioning variables you should probably use a > function to generate the subsets of the appropriate size rather than > trying to write ever more deeply nested loops. There are plenty of > functions around to do this. > > Bill Venables. > > ______________________________________________ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > > -- > View this message in context: > http://www.nabble.com/Re%3A--R--Higher-Dimensional-Matrices-tf2880928.html#a8051407 > Sent from the R help mailing list archive at Nabble.com. > > > > ------------------------------ > > Message: 16 > Date: Tue, 26 Dec 2006 06:41:16 +0000 (GMT) > From: justin bem <justin_bem@yahoo.fr> > Subject: [R] Merry Xmas > To: R Maillist <r-help@stat.math.ethz.ch> > Message-ID: <20061226064116.53756.qmail@web23014.mail.ird.yahoo.com> > Content-Type: text/plain > > Merry Xmas to U all and happy New year 2007. > > Justin BEM > El�ve Ing�nieur Statisticien Economiste > BP 294 Yaound�. > T�l (00237)9597295. > > > > > > > > > > ___________________________________________________________________________ > > interface r�volutionnaire. > > [[alternative HTML version deleted]] > > > > ------------------------------ > > Message: 17 > Date: Tue, 26 Dec 2006 17:42:18 +0900 > From: talepanda <talepanda@gmail.com> > Subject: Re: [R] defining color sequence in image() > To: R-help <r-help@stat.math.ethz.ch> > Message-ID: > <3948d9e50612260042l49c52176teb4996fdb6ebe585@mail.gmail.com> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > one solution is: > > img1<-matrix(1:5) > img2<-matrix(2:5) > col<-1:5 # col<-c("green","yellow",...) > image(img1,col=col[sort(unique(img1))]) > image(img2,col=col[sort(unique(img2))]) > > On 12/26/06, Milton Cezar Ribeiro <milton_ruser@yahoo.com.br> wrote: > > Dear All, > > > > How can I define a color sequence for each image value? I have several > > images with values varying from 1 to 5, and I would like to assing a > fixed > > color for each value (green for 1, yellow for 2...). I used somethink > like: > > > > > image(img,col=c("green","yellow",...)) > > > > but unfortunately whem I have absent values, the color that I defined > for > > each values change. > > > > Thanks a lot, > > > > Miltinho > > Brazil > > > > PS : Merry Christmas!!! > > > > __________________________________________________ > > > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > ------------------------------ > > Message: 18 > Date: Mon, 25 Dec 2006 17:13:46 -0600 > From: "Geoffrey Zhu" <gzhu@peak6.com> > Subject: [R] HOW TO: vectorize an iterative process. > To: <r-help@stat.math.ethz.ch> > Message-ID: > <99F81FFD0EA54E4DA8D4F1BFE272F3410821BE@ppi-mail1.chicago.peak6.net > > > Content-Type: text/plain; charset="utf-8" > > SGkgRXZlcnlvbmUsDQogDQpJIGFtIHN0dWNrIHdpdGggYSBzaW1wbGUgcHJvYmxlbS4gU3Vw > cG9zZSBJIGhhdmUgYSB2ZWN0b3IgeCwgYW5kIEkgd2FudCB0byBjYWxjdWxhdGUgeVtpXT14 > W2krMV0teFtpXSwgaXQgaXMgdmVyeSBlYXN5LiBJIGp1c3QgbmVlZCB0byB3cml0ZSB5PC14 > WzI6bGVuZ3RoKHgpXS14WzE6bGVuZ3RoKHgpLTFdLiANCiANCk5vdyBpZiBJIGtub3cgeSwg > YW5kIHdhbnQgdG8ga25vdyB0aGUgdmVjdG9yIHggZGVmaW5lZCBieSB4W2ldPXhbaS0xXSt5 > W2ktMV0gZm9yIGFsbCBpLCBob3cgY2FuIEkgZG8gdGhpcyB3aXRob3V0IGEgbG9vcD8gDQog > DQpUaGFua3MsDQpHZW9mZnJleQ0KIA0KDQpfX19fX19fX19fX19fX19fX19fX19fX19fX19f > X19fX19fX19fX19fX19fX19fX19fX19fX19fCgoKVGhlIGluZm9ybWF0aW9uIGluIHRoaXMg > ZW1haWwgb3IgaW4gYW55IGZpbGUgYXR0YWNoZWQgaGVyZXRvIGlzCmludGVuZGVkIG9ubHkg > Zm9yIHRoZSBwZXJzb25hbCBhbmQgY29uZmlkZW50aWFsIHVzZSBvZiB0aGUgaW5kaXZpZHVh > bApvciBlbnRpdHkgdG8gd2hpY2ggaXQgaXMgYWRkcmVzc2VkIGFuZCBtYXkgY29udGFpbiBp > bmZvcm1hdGlvbiB0aGF0IGlzCnByb3ByaWV0YXJ5IGFuZCBjb25maWRlbnRpYWwuIElmIHlv > dSBhcmUgbm90IHRoZSBpbnRlbmRlZCByZWNpcGllbnQgb2YKdGhpcyBtZXNzYWdlIHlvdSBh > cmUgaGVyZWJ5IG5vdGlmaWVkIHRoYXQgYW55IHJldmlldywgZGlzc2VtaW5hdGlvbiwKZGlz > dHJpYnV0aW9uIG9yIGNvcHlpbmcgb2YgdGhpcyBtZXNzYWdlIGlzIHN0cmljdGx5IHByb2hp > Yml0ZWQuIFRoaXMgY29tbXVuaWNhdGlvbiBpcyBmb3IgaW5mb3JtYXRpb24gcHVycG9zZXMg > b25seSBhbmQgc2hvdWxkIG5vdCBiZSByZWdhcmRlZCBhcyBhbiBvZmZlciB0byBzZWxsIG9y > IGFzIGEgc29saWNpdGF0aW9uIG9mIGFuIG9mZmVyIHRvIGJ1eSBhbnkgZmluYW5jaWFsIHBy > b2R1Y3QuIEVtYWlsIHRyYW5zbWlzc2lvbiBjYW5ub3QgYmUgZ3VhcmFudGVlZCB0byBiZSBz > ZWN1cmUgb3IgZXJyb3ItZnJlZS4NCg== > > > > ------------------------------ > > Message: 19 > Date: Mon, 25 Dec 2006 13:36:50 +0100 > From: "Sebastian Michalski" <s_michalski@o2.pl> > Subject: [R] writing to S-PLUS .dat file > To: <r-help@stat.math.ethz.ch> > Message-ID: <005001c72821$59e96bd0$51cf1251@smicha> > Content-Type: text/plain; format=flowed; charset="iso-8859-2"; > reply-type=original > > Dear Users, > I am new to R. I use write() to write my data in .txt format. I'd like > to write to a disc any kind of data in a .dat S-PLUS format. > Please help. > SM > > > > ------------------------------ > > Message: 20 > Date: Mon, 25 Dec 2006 14:57:04 +0100 (CET) > From: Achim Zeileis <Achim.Zeileis@wu-wien.ac.at> > Subject: Re: [R] extend summary.lm for hccm? > To: John Fox <jfox@mcmaster.ca> > Cc: r-help@stat.math.ethz.ch, 'ivo welch' <ivowel@gmail.com>, 'Dirk > Eddelbuettel' <edd@debian.org> > Message-ID: > <Pine.LNX.4.44.0612251439200.8229-100000@disco.wu-wien.ac.at> > Content-Type: TEXT/PLAIN; charset=US-ASCII > > On Sun, 24 Dec 2006, John Fox wrote: > > > If I remember Freedman's recent paper correctly, he argues that sandwich > > variance estimator, though problematic in general, is not problematic in > the > > case that White described -- an otherwise correctly specified linear > model > > with heteroscedasticity estimated by least-squares. > > More generally, sandwich-type estimators are valid (i.e., estimate > the right quantity, although not necessarily precisely, as Frank pointed > out) in situations where the estimating functions are correctly specified > but remaining aspets of the likelihood (not captured in the estimating > functions) are potentially not. > > In linear models, it is easy to see what this means: the mean function has > to be correctly specified (i.e., the errors have zero mean) but the > correlation structure of the errors (i.e., their (co-)variances) might > differ from the usual assumptions. In GLMs, in particular logistic > regression, it is much more difficult to see against which types of > misspecification sandwich-based inference is robust. > > Freedman's paper stresses the point that many model misspecifications also > imply misspecified estimating functions (and in his example this is rather > obvious) so that consequently the sandwich-type estimators estimate the > wrong quantity. > > Best wishes, > Z > > > > ------------------------------ > > _______________________________________________ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > End of R-help Digest, Vol 46, Issue 26 > ************************************** >
-- Om Prakash Research Student, Bioinformatics Center University of Pune Pune-411007 India [[alternative HTML version deleted]]Received on Wed Dec 27 03:42:55 2006______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.
Archive generated by hypermail 2.1.8, at Thu 28 Dec 2006 - 07:30:27 GMT.
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.