Re: [R] keep track of selected observations over time

From: Petr Pikal <petr.pikal_at_precheza.cz>
Date: Thu 25 Jan 2007 - 14:24:27 GMT

Hi

On 24 Jan 2007 at 18:26, Jenny persson wrote:

Date sent:      	Wed, 24 Jan 2007 18:26:17 +0100 (CET)
From:           	Jenny persson <jenny197806@yahoo.se>
To:             	r-help@stat.math.ethz.ch
Subject:        	Re: [R] keep track of selected observations over time

>
> Thanks Peter, I forgot that the mailing list only accepts PDF and
> PS files.
>

Well, I am not sure I understand what you want (your example is not reproducible), but besides drawing the plot, boxplot() also invisibly returns the structure used for plotting, which is a list. So you can call

b.str <- boxplot(.....)

and if you look through b.str you will find

$out and $group: $out holds the outlier values and $group the number of the box each outlier belongs to. Then you can do e.g.

sel <- your.data[,2] %in% b.str$out[b.str$group == 1]

to get a logical vector indicating which rows are marked as outliers in your first data column (the first box). Then

your.data[sel,]

gives you the corresponding rows, which you can use e.g. for adding lines

lines(1:4, your.data[sel,][1,-1])
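To make the steps above concrete, here is a minimal sketch that strings them together. It assumes the 20 rows posted below are the whole data set (the real pat1 is larger, so the flagged peptides will differ), and the names days, out.d0 and tracked are only illustrative. Note that %in% matches by value, so a non-outlier that happens to equal an outlier value would be picked up as well.

```r
# The 20 rows posted in the question, read into a data frame.
pat1 <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
peptides P1_D0 P1_D56 P1_D112 P1_D252
AAAKKGSEQTLKS -0.06181601 -0.12610877 -0.057898384 -0.02126862
AAAAPASEDEDDE -0.10972387 -0.17174722 -0.136468783 -0.16262501
AAAAVSSAKRSLR 0.64156129 1.02630879 0.079891841 0.29757984
AAAKKGSEQESVK -0.54943062 -0.34311337 -0.338910367 -0.14526498
AAANLTKIVAYLG -1.72207627 -1.63326368 -0.459839317 -0.63302448
AACGRISYNDMFE 0.52513671 0.65123495 1.151866644 1.49481479
AAEAEKAASESLR -0.69366543 -0.47038765 -0.144156174 -0.16042556
AAEHAQSCRSSAA -0.13373130 -0.09229543 -0.102485597 -0.09782440
AAERHARLNDSYRLQ -0.19316423 -0.33164239 -0.033764989 -0.11734969
AAETISAARALPS -0.49632307 -0.53666696 -0.263024663 -0.18231712
AAEVQRFNRDVDETI -0.80014439 -0.91002202 -0.257201702 -0.12391146
AAEWTANVTAEFK -0.41544438 -0.10980658 -0.288133150 -0.32022460
AAGIQWSEEETED -0.04015673 0.08529726 0.002471231 0.07599156
AAGPALSPVPPVV -0.26795462 -0.36739148 -0.512049278 -0.25449224
AAGPPPSEGEEST -1.59272674 -1.69729759 -0.843351943 -0.49271773
AAKIASRQPDSHI -0.40722382 -0.27236225 -0.224539441 -0.32998813
AAKIQASFRGHMA 2.41234976 2.84435484 0.160852331 0.80197802
AALDLGGSSDPYV -1.21202038 -1.25109705 -0.259515922 -0.24351352
AALEPGPSESLTA -2.00256570 -1.57566020 -0.390584034 -0.23682626
AALLELWELRRQQYE 1.42797600 1.33539104 1.486154861 1.67471189
")

days <- c("Day 0", "56 days", "112 days", "252 days")

# boxplot() draws the plot and invisibly returns the plotting structure
b.str <- boxplot(pat1[, 2:5], names = days)

# Outlier values belonging to the first box (day 0)
out.d0 <- b.str$out[b.str$group == 1]

# Logical vector: which rows are day-0 outliers
sel <- pat1$P1_D0 %in% out.d0
tracked <- pat1[sel, ]

# Follow each selected peptide across the four boxes,
# one colour and one id number per peptide
for (i in seq_len(nrow(tracked))) {
  y <- unlist(tracked[i, -1])          # the four responses as a plain vector
  lines(1:4, y, col = i + 1, lty = 2)
  text(1:4, y, labels = i, col = i + 1, pos = 4)
}
legend("topright", legend = tracked$peptides,
       col = seq_len(nrow(tracked)) + 1, lty = 2, bty = "n", cex = 0.7)
```

Changing b.str$group == 1 and pat1$P1_D0 to another box and column would track the outliers of a different time point in the same way.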

HTH
Petr

> Here is my problem again:
>
> Attached is a graph of four boxplots of one patient's data at four
> time points, i.e. each boxplot presents the data at one time point.
> At day 0 there are 5 extreme values from five peptide sequences
> (please see the data below). Since the response changes over time,
> some of these five extreme values at day 0 may be lower or higher
> at days 56, 112 and 252. How can I trace the location of each peptide
> sequence that has an extreme value at day 0 on the boxplots at days
> 56, 112 and 252, by color or number coding? For example, the five
> most responding peptides could be ranked from 5 (highest value) to 1
> (lowest), so that when I redo the graph I would see the five extreme
> values at day 0 as the numbers 5-1, and each of these numbers could
> be anywhere on the boxplots at days 56, 112 and 252. Instead of the
> rank numbers, the peptide sequence that corresponds to each value
> could be used. Alternatively, the locations of each peptide sequence
> at the four time points could be linked by a line.
>
> I would like to repeat this procedure for time points 56, 112 and
> 252 as well. That is, at day 56, I want to trace where on the
> boxplots at days 0, 112 and 252 each of the four peptide sequences
> with the highest responses is. Again, these four values/peptides can
> be presented by different colors, numbers or their peptide sequences
> that distinguish them from the other most responding peptides at
> days 0, 112 and 252. Can I do the four procedures at the same time?
> I mean, if at each time point I want to keep track of where the most
> responding peptides from that time point are, then the total number
> of peptides over the four time points could be 20. That is, for each
> boxplot there would be 20 id numbers, each corresponding to a peptide
> at the respective time point. The graph could get rather messy.
>
> I have a simple solution for seeing how the most responding peptides
> change over time: plotting each peptide's responses at the four
> time points. But I haven't managed the procedure above. If you
> have any suggestion for how I can do this in R, I would be very
> thankful.
>
> Many thanks
> Jenny
>
>
>
> Part of the data:
>
> > pat1[1:20,]
> peptides P1_D0 P1_D56 P1_D112 P1_D252
> 1 AAAKKGSEQTLKS -0.06181601 -0.12610877 -0.057898384 -0.02126862
> 2 AAAAPASEDEDDE -0.10972387 -0.17174722 -0.136468783 -0.16262501
> 3 AAAAVSSAKRSLR 0.64156129 1.02630879 0.079891841 0.29757984
> 4 AAAKKGSEQESVK -0.54943062 -0.34311337 -0.338910367 -0.14526498
> 5 AAANLTKIVAYLG -1.72207627 -1.63326368 -0.459839317 -0.63302448
> 6 AACGRISYNDMFE 0.52513671 0.65123495 1.151866644 1.49481479
> 7 AAEAEKAASESLR -0.69366543 -0.47038765 -0.144156174 -0.16042556
> 8 AAEHAQSCRSSAA -0.13373130 -0.09229543 -0.102485597 -0.09782440
> 9 AAERHARLNDSYRLQ -0.19316423 -0.33164239 -0.033764989 -0.11734969
> 10 AAETISAARALPS -0.49632307 -0.53666696 -0.263024663 -0.18231712
> 11 AAEVQRFNRDVDETI -0.80014439 -0.91002202 -0.257201702 -0.12391146
> 12 AAEWTANVTAEFK -0.41544438 -0.10980658 -0.288133150 -0.32022460
> 13 AAGIQWSEEETED -0.04015673 0.08529726 0.002471231 0.07599156
> 14 AAGPALSPVPPVV -0.26795462 -0.36739148 -0.512049278 -0.25449224
> 15 AAGPPPSEGEEST -1.59272674 -1.69729759 -0.843351943 -0.49271773
> 16 AAKIASRQPDSHI -0.40722382 -0.27236225 -0.224539441 -0.32998813
> 17 AAKIQASFRGHMA 2.41234976 2.84435484 0.160852331 0.80197802
> 18 AALDLGGSSDPYV -1.21202038 -1.25109705 -0.259515922 -0.24351352
> 19 AALEPGPSESLTA -2.00256570 -1.57566020 -0.390584034 -0.23682626
> 20 AALLELWELRRQQYE 1.42797600 1.33539104 1.486154861 1.67471189
>
>
> par(las=1) # all axis labels horizontal
> boxplot(data.frame(pat1[,c(2:5)]),
>         pars = list(boxwex = 0.4, staplewex = 0.8, outwex = .5),
>         boxfill="lightblue", border=c(3:6),
>         names=c("Day 0", "56 days", "112 days", "252 days"),
>         col.main="blue",
>         main="Averaged peptide response at 4 different time points for patient 200001",
>         cex.main=0.9, font.main=0.9)
>
> Peter Konings <peter.l.e.konings@gmail.com> wrote:
> Dear Jenny,
>
> Your post did not have an attachment. The mailing list software strips
> most attachments away: see the 'technical details of posting' section
> of the posting guide at: http://www.r-project.org/posting-guide.html.
>
> HTH
> Peter.
>
> On 1/24/07, Jenny persson <jenny197806@yahoo.se> wrote:
> Dear all,
>
> Attached is a description of my data, graph and the problem which I
> need help with. Hope you have time to open the file and help me out.
>
> Many thanks,
> Jenny
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented,
> minimal, self-contained, reproducible code.
>

Petr Pikal
petr.pikal@precheza.cz



Received on Fri Jan 26 01:36:57 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 25 Jan 2007 - 15:30:30 GMT.
