Re: [R] Parsing a data file - Help

From: Chuck Cleland <ccleland_at_optonline.net>
Date: Wed, 11 Jun 2008 14:38:42 -0400

On 6/11/2008 1:29 PM, A Ezhil wrote:
> Hi All,
>
> I have the data in the following format:
>
> idkt saap lahto pidg
> 5266 19911111 19911114 3078A
> 5266 19921005 19921030 2968A
> 6666 19930208 19930209 3074A
> 6666 20020329 20020402 F322
> 6666 20020402 20020409 F322
> 6866 19810713 19810917 29800
> 6866 19811109 19811120 29550
> 6866 19820203 19820219 29550
>
> I would like to parse the data and reformat into a single row for each unique idkt, something like:
> 5266 19911111 19911114 3078A 19921005 19921030 2968A
>
> I have tried with
>
> f <- read.table("file.txt", sep="\t", header=TRUE);
> attach(f);
> fac <- factor(f[,1]);
> id <- matrix(length(fac), 4);
> for(i in fac) id[i] <- f[idkt %in% fac[i], ];
>
> I am not able to make the list id into a single row. Could you please show me how I can do this?

   If you can create a variable that differentiates multiple records from the same idkt, you can use reshape() like this:

f <- "idkt saap lahto pidg
5266 19911111 19911114 3078A
5266 19921005 19921030 2968A
6666 19930208 19930209 3074A
6666 20020329 20020402 F322
6666 20020402 20020409 F322
6866 19810713 19810917 29800
6866 19811109 19811120 29550
6866 19820203 19820219 29550"

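# Read the space-separated text into a data frame
fdata <- read.table(textConnection(f), sep=" ", header=TRUE)

# Number the records within each idkt (1, 2, ...); this assumes the rows are sorted by idkt
fdata$time <- unlist(lapply(table(fdata$idkt), function(x){1:x}))

# Spread each record's columns across a single row per idkt
reshape(fdata, idvar = "idkt", timevar = "time", direction="wide")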

  idkt   saap.1  lahto.1 pidg.1   saap.2  lahto.2 pidg.2   saap.3  lahto.3 pidg.3
1 5266 19911111 19911114  3078A 19921005 19921030  2968A       NA       NA   <NA>
3 6666 19930208 19930209  3074A 20020329 20020402   F322 20020402 20020409   F322
6 6866 19810713 19810917  29800 19811109 19811120  29550 19820203 19820219  29550
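
   The table()-based counter above lines up with the rows only when the file is already sorted by idkt. If that cannot be guaranteed, one possible variation is to build the counter with ave(), which numbers records within each group in place. The sketch below is only an illustration: the shuffled copy of the sample data, the name txt, and the choice to order by saap (taken here to be the start date) are assumptions, not part of the original question.

```r
# The sample data from this thread, with the rows deliberately shuffled.
txt <- "idkt saap lahto pidg
6666 20020329 20020402 F322
5266 19911111 19911114 3078A
6866 19810713 19810917 29800
6666 19930208 19930209 3074A
5266 19921005 19921030 2968A
6866 19811109 19811120 29550
6666 20020402 20020409 F322
6866 19820203 19820219 29550"

fdata <- read.table(textConnection(txt), header = TRUE)

# Optional: order by id and then by saap, so that the .1, .2, ... suffixes
# follow the order of the saap values within each id.
fdata <- fdata[order(fdata$idkt, fdata$saap), ]

# ave() applies seq_along() within each idkt group and returns the result
# in the current row order, so the counter does not depend on sorting.
fdata$time <- ave(seq_along(fdata$idkt), fdata$idkt, FUN = seq_along)

# One row per idkt; ids with fewer records are padded with NA.
wide <- reshape(fdata, idvar = "idkt", timevar = "time", direction = "wide")
wide
```

   The explicit order() call is not needed for ave() to work; it only controls which record ends up as .1, .2 and so on.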

> Thanks in advance.
>
> Kind regards,
> Ezhil
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

Received on Wed 11 Jun 2008 - 19:10:11 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.