Re: [R] merging data list in to single data frame

From: Umesh Rosyara <rosyara_at_msu.edu>
Date: Mon, 04 Apr 2011 16:10:40 -0400

Thank you, Dennis, for the solution. It is a step forward. However, I need to read all 200 files as data frames one by one. Can we automate this process? I used the following steps to read all the files at once, but data_list ended up as a list.

filelist = list.files(pattern = "K*cd.txt") # the file names are K1cd.txt to K200cd.txt

data_list <- lapply(filelist, read.table, header=T, comment=";", fill=T)
names(filelist) <- 1:length(filelist)

library("plyr")

ldply(data_list, rbind)

I tried to apply your approach to the list, but the result does not have the .id variable (although it does bind the component data frames together). Perhaps the approach works for individually created data frames but not for a list holding many data frames.
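A possible explanation, offered as a sketch rather than a confirmed diagnosis: in the code above the names are assigned to filelist, not to data_list, and ldply() adds the .id column only when the list it receives is named. The small data frames below are stand-ins for the files read with read.table(); plyr is assumed to be installed.

```r
library("plyr")

# Stand-ins for the data frames that lapply(filelist, read.table, ...) would return
data_list <- list(
  data.frame(var1 = c(1, 3), var2 = c(6, 4)),
  data.frame(var1 = c(1, 3), var2 = c(16, 14))
)

# Name the list of data frames itself, not the vector of file names
names(data_list) <- seq_along(data_list)

# With a named list, the result carries the list names in a .id column
combined <- ldply(data_list, rbind)
names(combined)
```

If this is indeed the cause, the list returned by lapply() can be passed to ldply() directly, so there is no need to create 200 separate data frames first.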

Do you have any suggestions for functions that can read the files (as I did above) and save each one as a new data frame (for example DF1, DF2, ...) rather than as a list of 200 data frames? If we can do that, then we will be able to use this approach.
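For what it is worth, a list can be unpacked into separately named data frames with base R's list2env(). The sketch below assumes the names DF1, DF2, ... and a hypothetical environment called target; the stand-in data frames take the place of the files that were read. Keeping the data in a list is usually easier to work with, so this is only an illustration of the request.

```r
# Stand-ins for the data frames read from the files
data_list <- list(
  data.frame(var1 = c(1, 3), var2 = c(6, 4)),
  data.frame(var1 = c(1, 3), var2 = c(16, 14))
)

# Give each component the name it should have as a separate object
names(data_list) <- paste0("DF", seq_along(data_list))

# Copy every component into an environment as its own object;
# 'target' is a hypothetical name, globalenv() could be used instead
target <- new.env()
list2env(data_list, envir = target)

ls(target)
```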

Thank you so much,

Umesh R  

From: Dennis Murphy [mailto:djmuser_at_gmail.com]
Sent: Monday, April 04, 2011 3:25 PM
To: Umesh Rosyara
Cc: r-help_at_r-project.org; rosyaraur_at_gmail.com
Subject: Re: [R] merging data list in to single data frame

Hi:

Here's an alternative using ldply() from the plyr package. The idea is to read the data frames into a list, name them accordingly and then call ldply().

# Read in the test data frames (you want to use list.files() instead to input the data per Uwe's guidelines)
df1 <- read.table(textConnection("
var1 var2 var3 var4
1 6 0.3 8
3 4 0.4 9
2 3 0.4 6
1 0.4 0.9 3"), header = TRUE)
df2 <- read.table(textConnection("
var1 var2 var3 var4
1 16 0.6 7
3 14 0.4 6
2 13 0.4 5
1 0.6 0.9 2"), header = TRUE)
closeAllConnections()
# generate the list
dl <- list(df1, df2)

# Name the list components by number and then call ldply():
names(dl) <- 1:2   # more generally, names(dl) <- 1:length(dl)
library("plyr")
ldply(dl, rbind)
  .id var1 var2 var3 var4
1   1    1  6.0  0.3    8
2   1    3  4.0  0.4    9
3   1    2  3.0  0.4    6
4   1    1  0.4  0.9    3
5   2    1 16.0  0.6    7
6   2    3 14.0  0.4    6
7   2    2 13.0  0.4    5
8   2    1  0.6  0.9    2

You can always change .id to fileno afterwards.
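A minimal base R sketch of that renaming step, assuming the combined result is stored in an object called combined (a hypothetical name) and that the new column should be called Fileno as in the original request:

```r
# Stand-in for the data frame returned by ldply(dl, rbind)
combined <- data.frame(.id = c("1", "1", "2", "2"),
                       var1 = c(1, 3, 1, 3),
                       var2 = c(6, 4, 16, 14))

# Rename the .id column in place
names(combined)[names(combined) == ".id"] <- "Fileno"

names(combined)
```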

HTH,
Dennis

On Mon, Apr 4, 2011 at 7:41 AM, Umesh Rosyara <rosyara_at_msu.edu> wrote:

Dear R community members

I have not found a good way to merge my 200 text data files into a single data frame, with one added column that indicates which file each row came from.

filelist = list.files(pattern = "K*cd.txt") # the file names are K1cd.txt to K200cd.txt

data_list <- lapply(filelist, read.table, header=T, comment=";", fill=T)

This creates a list, but that is not what I want.

I want a single data frame (all the separate data frames have the same variable headings) with an additional column, for example:

; just for example, two small datasets are created, but my component datasets are huge and need automation

;read from file K1cd.txt
var1 var2 var3 var4
1 6 0.3 8
3 4 0.4 9
2 3 0.4 6
1 0.4 0.9 3

;read from file K2cd.txt
var1 var2 var3 var4
1 16 0.6 7
3 14 0.4 6
2 13 0.4 5
1 0.6 0.9 2

The output data frame should look like this:

Fileno var1 var2 var3 var4
1 1 6 0.3 8
1 3 4 0.4 9
1 2 3 0.4 6
1 1 0.4 0.9 3
2 1 16 0.6 7
2 3 14 0.4 6
2 2 13 0.4 5
2 1 0.6 0.9 2

Please note that a new file number column (Fileno) is added.
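One way to get this layout with base R alone is sketched below. It is an illustration under stated assumptions, not the method given in the replies: the two temporary files stand in for K1cd.txt to K200cd.txt, the pattern is written as a regular expression (list.files() does not take shell-style wildcards), and the file number is taken from the file name because list.files() sorts alphabetically (K1, K10, K100, ...).

```r
# Write two tiny example files to a temporary directory (stand-ins for the real files)
tmp <- tempfile("kfiles")
dir.create(tmp)
writeLines(c("var1 var2 var3 var4", "1 6 0.3 8", "3 4 0.4 9"),
           file.path(tmp, "K1cd.txt"))
writeLines(c("var1 var2 var3 var4", "1 16 0.6 7", "3 14 0.4 6"),
           file.path(tmp, "K2cd.txt"))

# Find the files; the pattern is a regular expression
filelist <- list.files(tmp, pattern = "^K[0-9]+cd\\.txt$", full.names = TRUE)

# Extract the file number from each name, e.g. "K2cd.txt" gives 2
fileno <- as.integer(sub("^K([0-9]+)cd\\.txt$", "\\1", basename(filelist)))

# Read every file into a list of data frames
data_list <- lapply(filelist, read.table, header = TRUE,
                    comment.char = ";", fill = TRUE)

# Prepend the file number to each data frame, then stack them all
combined <- do.call(rbind,
                    Map(function(d, n) cbind(Fileno = n, d), data_list, fileno))
rownames(combined) <- NULL
combined
```

The same loop handles 2 or 200 files, since everything is driven by filelist.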

Thank you for the help.

Umesh R

       [[alternative HTML version deleted]]



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon 04 Apr 2011 - 20:17:55 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 17 Apr 2011 - 06:10:31 GMT.
