Re: [R] a question about data manipulation

From: jim holtman <jholtman_at_gmail.com>
Date: Wed 03 Aug 2005 - 01:56:18 EST

Use 'split'. It returns a list of data frames, one per value of id:

> x.1 <- data.frame(COL1 = 1:50, COL2 = 50:1, id = sample(1:4, 50, replace = TRUE))
> x.2 <- split(x.1, x.1$id)
> str(x.2)
List of 4
 $ 1:`data.frame':      10 obs. of  3 variables:
  ..$ COL1: int [1:10] 5 10 11 12 22 24 27 34 38 47
  ..$ COL2: int [1:10] 46 41 40 39 29 27 24 17 13 4
  ..$ id  : int [1:10] 1 1 1 1 1 1 1 1 1 1
 $ 2:`data.frame':      13 obs. of  3 variables:
  ..$ COL1: int [1:13] 1 2 14 16 19 25 26 28 30 31 ...
  ..$ COL2: int [1:13] 50 49 37 35 32 26 25 23 21 20 ...
  ..$ id  : int [1:13] 2 2 2 2 2 2 2 2 2 2 ...
 $ 3:`data.frame':      14 obs. of  3 variables:
  ..$ COL1: int [1:14] 3 8 9 13 17 23 32 36 39 42 ...
  ..$ COL2: int [1:14] 48 43 42 38 34 28 19 15 12 9 ...
  ..$ id  : int [1:14] 3 3 3 3 3 3 3 3 3 3 ...
 $ 4:`data.frame':      13 obs. of  3 variables:
  ..$ COL1: int [1:13] 4 6 7 15 18 20 21 29 35 37 ...
  ..$ COL2: int [1:13] 47 45 44 36 33 31 30 22 16 14 ...
  ..$ id  : int [1:13] 4 4 4 4 4 4 4 4 4 4 ...
> names(x.2)
[1] "1" "2" "3" "4"
> x.2[['1']]
   COL1 COL2 id
5     5   46  1
10   10   41  1
11   11   40  1
12   12   39  1
22   22   29  1
24   24   27  1
27   27   24  1
34   34   17  1
38   38   13  1
47   47    4  1
> x.2[['3']]
   COL1 COL2 id
3     3   48  3
8     8   43  3
9     9   42  3
13   13   38  3
17   17   34  3
23   23   28  3
32   32   19  3
36   36   15  3
39   39   12  3
42   42    9  3
44   44    7  3
45   45    6  3
49   49    2  3
50   50    1  3
>
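
To get from the split() result to the nested list format shown in the question quoted below, something along these lines might work. This is only a sketch: the data frame x is retyped from the six rows in the question, N = 2 and n = c(100, 150) are simply the constants used there, and the name "result" is made up for illustration.

```r
# Example data retyped from the question (original row names not kept).
x <- data.frame(
  COL1 = c(12, 70, 58, 14, 88, 90),
  COL2 = c(49, 120, 124, 13, 100, 134),
  id   = c(1, 1, 1, 2, 2, 2)
)

# Split by id, then build one list element per group.
# c(COL1, COL2) always has an even length (two columns per row),
# so nrow = 2 fills a 2-row matrix column by column, as in the question.
result <- lapply(split(x, x$id), function(d) {
  list(N = 2, n = c(100, 150), matrix(c(d$COL1, d$COL2), nrow = 2))
})

# Elements are named by id, so either form gives the first group.
result[["1"]]
result[[1]]
```

Because lapply() runs over whatever groups split() finds, the same call should cover ids 1 to 100 without repeating the list(...) expression for each one.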

On 8/2/05, qi zhang <shellyzhang77@gmail.com> wrote:
> Dear R-user,
> I have a simple question, but I can't figure out an easy way to handle it.
> My imported data x looks like this:
>    COL1 COL2 id
> 1    12   49  1
> 2    70  120  1
> 3    58  124  1
> 51   14   13  2
> 52   88  100  2
> 53   90  134  2
> I want to change the format of the data: I want to group the data into
> different parts according to id, so that x[1] refers to the
> information about the first id. I use the command:
>
> list(list(N=2,n=c(100,150),matrix(c(x[x$id==1,][,1],x[x$id==1,][,2]),nr=2,nc=3)),list(N=2,n=c(100,150),matrix(c(x[x$id==2,][,1],x[x$id==2,][,2]),nr=2,nc=3)))
>
> so the data becomes :
>
> [[1]]
> [[1]]$N
> [1] 2
>
> [[1]]$n
> [1] 100 150
>
> [[1]][[3]]
>      [,1] [,2] [,3]
> [1,]   12   58  120
> [2,]   70   49  124
>
>
> [[2]]
> [[2]]$N
> [1] 2
>
> [[2]]$n
> [1] 100 150
>
> [[2]][[3]]
>      [,1] [,2] [,3]
> [1,]   14   90  100
> [2,]   88   13  134
>
> This is the format I want, but the problem is that in my data, id runs not
> just from 1 to 2 but from 1 to 100, so my code is not efficient. Could you
> help me find a more efficient way? Thanks.
>
> Qi Zhang
>
> PhD student,
>
> University of Cincinnati
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

-- 
Jim Holtman
Convergys
+1 513 723 2929

What is the problem you are trying to solve?

Received on Wed Aug 03 02:10:24 2005

This archive was generated by hypermail 2.1.8 : Sun 23 Oct 2005 - 15:01:40 EST