[R] functionality of "update" in SAS

From: Denis Chabot <chabotd_at_globetrotter.net>
Date: Wed 20 Sep 2006 - 19:42:36 GMT

Dear list,

I've searched the archives but found nothing, although I may have used the wrong search terms. I've also double-checked the upData function in Hmisc, but it does something else.

I'm wondering if one can update a dataframe by "forcing into" it a shorter dataframe containing the corrections, like the "update" provided in SAS data steps.

In this simple example:
a <- data.frame(id=c(1:5),x=rnorm(5))
b <- data.frame(id=4,x=rnorm(1))
> a
  id          x
1  1  0.6557921
2  2  0.1897523
3  3  0.7976721
4  4  0.2107103
5  5 -0.8855786

> b
  id         x
1  4 0.8369147

I would like the "updated" dataframe to look like (row names are not important to me)

  id          x
1  1  0.6557921
2  2  0.1897523
3  3  0.7976721
4  4  0.8369147
5  5 -0.8855786

I thought this could be done with merge, but merge never removes the old version of a row; it just gives me two rows with id==4.
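A minimal sketch of the merge behaviour described above, and of one possible way around it. It assumes the values printed in the example (typed in by hand instead of drawn with rnorm()), and the ".new" suffix is just an illustrative choice. Merging on id only, rather than on every shared column, puts the correction in its own column, which can then be folded back into x:

```r
# Data taken from the printed example (fixed values instead of rnorm()).
a <- data.frame(id = 1:5,
                x = c(0.6557921, 0.1897523, 0.7976721, 0.2107103, -0.8855786))
b <- data.frame(id = 4, x = 0.8369147)

# By default merge() matches on all shared columns (id AND x), so the old
# and the new version of id 4 are both kept.
dup <- merge(a, b, all = TRUE)

# Matching on id only places the correction in a separate column, x.new.
m <- merge(a, b, by = "id", all.x = TRUE, suffixes = c("", ".new"))

# Use the correction where one exists, otherwise keep the original value.
m$x <- ifelse(is.na(m$x.new), m$x, m$x.new)
m$x.new <- NULL
```

Here dup has six rows (two of them with id 4), while m has the five original ids with only the x value for id 4 replaced. Note that merge() sorts the result on the key by default.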

I thought of this solution:

reject <- a$id %in% b$id
a2 <- a[!reject,]
a3 <- rbind(a2,b)
> a3
   id          x
1   1  0.6557921
2   2  0.1897523
3   3  0.7976721
5   5 -0.8855786
11  4  0.8369147

This works, though it is obviously not the best way to make the correction in a simple case like this. But providing a few lines of corrected data can be an effective method with large dataframes, especially if many identifier (grouping) variables are needed to identify each line that needs updating. In that context my solution above rapidly becomes ugly.
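One way the reject-and-rbind idea might be generalised to several identifier variables is sketched below. update_rows() is a hypothetical helper written for this illustration (it is not part of base R or Hmisc), and the site/year/y data are made up. It assumes the corrections contain every column of the original dataframe, and that the identifier values paste to the same text in both dataframes:

```r
# Hypothetical helper, not a base R or Hmisc function.
update_rows <- function(old, new, keys) {
  # Build one composite key per row from all identifier columns.
  key_old <- do.call(paste, c(old[keys], sep = "\r"))
  key_new <- do.call(paste, c(new[keys], sep = "\r"))
  # Drop the outdated rows, then append the corrected ones
  # (columns of 'new' are put in the same order as in 'old').
  out <- rbind(old[!(key_old %in% key_new), , drop = FALSE],
               new[, names(old), drop = FALSE])
  # Sort on the identifiers and reset the row names.
  out <- out[do.call(order, unname(as.list(out[keys]))), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Made-up example with two identifier variables.
old <- data.frame(site = c("A", "A", "B", "B"),
                  year = c(2005, 2006, 2005, 2006),
                  y    = c(1.1, 2.2, 3.3, 4.4))
new <- data.frame(site = "B", year = 2005, y = 9.9)

fixed <- update_rows(old, new, keys = c("site", "year"))
```

In this sketch only the row for site B in 2005 changes (y becomes 9.9); the identifier columns are named once, in the keys argument, however many of them there are.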

Furthermore (though I can live with this constraint), this method replaces entire rows, so I need to make sure the dataframe used to make corrections contains all the non-identifier (Y) variables in the original dataframe, even those that do not need correcting.

If a method exists to just change one variable in 5 lines for a dataframe of 5000 lines and 30 variables, I'd appreciate learning about it. But I'll already be thrilled if I can update whole lines at a time.
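A possible sketch for changing a single variable in a few rows, using match() to locate the rows to overwrite. The column z and the unmatched id 9 are made up for illustration; the other values come from the example above. Only x is touched, so the corrections do not need to carry the other columns:

```r
a <- data.frame(id = 1:5,
                x = c(0.6557921, 0.1897523, 0.7976721, 0.2107103, -0.8855786),
                z = letters[1:5])                    # z: made-up extra column
b <- data.frame(id = c(4, 9), x = c(0.8369147, 1))   # id 9 does not exist in a

# Row of a that each correction refers to (NA when the id is not in a).
idx <- match(b$id, a$id)
found <- !is.na(idx)

# Overwrite only x, and only in the matched rows; z is left untouched.
a$x[idx[found]] <- b$x[found]
```

Corrections whose id has no match in a are silently skipped here; whether they should instead be appended or reported is a design choice this sketch leaves open.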


Denis Chabot

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Thu Sep 21 05:47:46 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 20 Sep 2006 - 20:31:27 GMT.
