Re: [R] FW: average replicate probe values

From: jim holtman <jholtman_at_gmail.com>
Date: Wed, 23 Jul 2008 20:15:42 -0400

Here is one way to do it:

> y <- textConnection("UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
+ 1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
+ 1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
+ 1187428 Hs.101014 CEP57 0.60085 0.2564 -0.42885 -0.57635 -0.14735
+ 1193447 Hs.101014 CEP57 -0.15625 -0.1681 -0.4891 -0.29995 NA
+ 1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133")

> x <- read.table(y, header=TRUE)
> closeAllConnections()
> # split and then aggregate so we can carry through some data
> z <- split(x, x$UniGene)
> z.l <- lapply(z, function(.data){
+     .agg <- colMeans(.data[, c(1,4:8)], na.rm=TRUE)
+     data.frame(.data[1, 2], .data[1, 3], lapply(.agg, unlist))
+ })

> do.call(rbind, z.l)
          .data.1..2. .data.1..3.  UNIQID    X1_SL    X2_SL    X17_SL   X18_SL   X38_SL
Hs.10095     Hs.10095       MLLT1 1175390 -0.00595  0.62315  0.853150  1.11215 -0.19500
Hs.10101     Hs.10101    C1orf166 1175392 -0.49450 -0.04025  0.129900 -0.00575 -0.18240
Hs.101014   Hs.101014       CEP57 1190438  0.22230  0.04415 -0.458975 -0.43815 -0.14735
Hs.1011       Hs.1011        PROZ 1173756 -0.72110 -0.68895  0.465100  0.30815  0.11330
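
Since ?aggregate came up earlier in the thread, the same averaging can also be sketched with aggregate(). This is only a minimal sketch under a few assumptions, not a replacement for the split/lapply code above: it reads the data with read.table(text = ..., check.names = FALSE) so the sample names stay as 1_SL, 2_SL, and so on; it groups on both UniGene and Gene; and it drops UNIQID, since an averaged probe ID has no meaning. The names sample_cols and avg are made up for this illustration.

```r
# Sample data from the thread. check.names = FALSE keeps the original
# sample names (1_SL, 2_SL, ...) instead of prefixing them with "X".
x <- read.table(text = "UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
1187428 Hs.101014 CEP57 0.60085 0.2564 -0.42885 -0.57635 -0.14735
1193447 Hs.101014 CEP57 -0.15625 -0.1681 -0.4891 -0.29995 NA
1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133",
                header = TRUE, check.names = FALSE, stringsAsFactors = FALSE)

# Everything except the identifier columns is treated as a sample column
# (an assumption about the layout of the real data).
sample_cols <- setdiff(names(x), c("UNIQID", "UniGene", "Gene"))

# Average each sample column within each UniGene/Gene group.
# na.rm = TRUE is passed on to mean(), so missing values are skipped.
avg <- aggregate(x[sample_cols],
                 by = x[c("UniGene", "Gene")],
                 FUN = mean, na.rm = TRUE)

print(avg)
```

The two CEP57 probes collapse into one row, and the NA in 38_SL is ignored because na.rm = TRUE is handed through to mean(). Rows come back sorted by the grouping columns rather than in input order.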

On Wed, Jul 23, 2008 at 5:08 PM, Kaposi-Novak, Pal <kaposinovakp_at_upmc.edu> wrote:

>

> ________________________________________
> From: Kaposi-Novak, Pal
> Sent: Wednesday, July 23, 2008 5:07 PM
> To: jim holtman
> Subject: RE: [R] average replicate probe values
>

> Dear Dr Holtman,
>

> Thank you very much for your response.
>

> What I want is to average the data points in a data.frame for probes that represent the same gene (i.e. have the same UniGene ID).
>

> For example in the table below probe sets in rows 3 and 4 both represent the CEP57 gene.
>

> UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
> 1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
> 1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
> 1187428 Hs.101014 CEP57 0.60085 0.2564 -0.42885 -0.57635 -0.14735
> 1193447 Hs.101014 CEP57 -0.15625 -0.1681 -0.4891 -0.29995 NA
> 1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133
>

> I would like to make R find the matching UniGene IDs and average expression values for each sample.
> The result would look like the table below:
>

> UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
> 1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
> 1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
> 1199466 Hs.101014 CEP57 0.2223 0.04415 -0.458975 -0.43815 -0.14735
> 1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133
>

> I am sorry for the naivety of my question; I am not a trained biostatistician and just need to analyze my data.
>

> Sincerely,
>

> Pal Kaposi-Novak MD PhD
> PIRT Fellow
> University of Pittsburgh
> Department of Pathology
> BST S408, 200 Lothrop Str
> Pittsburgh, PA , 15261
> Tel: (412) 383-7748
> kaposinovakp_at_umpc.edu
> ________________________________________
> From: jim holtman [jholtman_at_gmail.com]
> Sent: Wednesday, July 23, 2008 7:15 AM
> To: Kaposi-Novak, Pal
> Cc: r-help_at_r-project.org
> Subject: Re: [R] average replicate probe values
>

> It would be helpful if you included a sample of the data so that we
> could understand what you would like to do with it (before/after
> pictures).
>

> ?aggregate
>

> On Tue, Jul 22, 2008 at 9:57 PM, Kaposi-Novak, Pal
> <kaposinovakp_at_upmc.edu> wrote:
>> Hi,
>>
>> Could somebody tell me how I can average expression values of replicate probe sets in a data frame?
>>
>> Thanks
>>
>> Pal Kaposi-Novak MD PhD
>> PIRT Fellow
>> University of Pittsburgh
>> Department of Pathology
>> BST S408, 200 Lothrop Str
>> Pittsburgh, PA , 15261
>> Tel: (412) 383-7748
>> kaposinovakp_at_umpc.edu
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>

> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>

> What is the problem you are trying to solve?
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Thu 24 Jul 2008 - 00:21:34 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 24 Jul 2008 - 01:32:34 GMT.
