[R] Problems with Merge

From: Patnaik, Tirthankar <tirthankar.patnaik_at_citi.com>
Date: Tue, 05 Jun 2007 17:26:23 +0530


Hi,
 I have a history dataset, a matrix with about 1590 obs, and 242 cols, and I need to update this matrix with an 'update' matrix that has about 30 rows, and roughly similar number of columns as the history ds (but not necessarily equal). The update dataset is read from an Excel ODBC connection. When I try and merge these datasets, I get counter-intuitive results.

library(RODBC)

   chn <- odbcConnectExcel(UpdateFile)
   sqlTables(chn)
   UpdateData <- sqlFetch(chn,"MCap243")    colnames(UpdateData) <- gsub("#",".",colnames(UpdateData))    close(chn)
# specify just how many rows we need from the Update file. We'd only
read five

# rows at a time.

   UpdateRows = 20
   UpdateData <- UpdateData[1:UpdateRows,]

# Delete Unwanted stocks.

   UpdateData <- UpdateData[,!names(UpdateData) %in% ToBeDeleted]    x <- tail(UpdateData[c("Date","ABAN.BO")],n=50)    print(x)    

Gives x as:

         Date  ABAN.BO
1  2007-04-30 96448.40
2  2007-05-01 96448.40
3  2007-05-02 96448.40
4  2007-05-03 96300.44
5  2007-05-04 93718.52
6  2007-05-05 93718.52
7  2007-05-06 93718.52
8  2007-05-07 92743.82
9  2007-05-08 90374.60
10 2007-05-09 89126.18
11 2007-05-10 87082.47
12 2007-05-11 85493.73
13 2007-05-12 85493.73
14 2007-05-13 85493.73
15 2007-05-14 85033.21
16 2007-05-15 89209.41
17 2007-05-16 89089.19
18 2007-05-17 90472.62
19 2007-05-18 90326.51
20 2007-05-19 90326.51

But when I merge this file with the history dataset, I get the dates misaligned by one row.

whistory <- merge(rhistory,UpdateData,by.x=rhistory["Date"],all=TRUE)

tail(WHist4[c("Date","ABAN.BO")],n=30)

          Date ABAN.BO

1581 2007-04-19 83632.60
1582 2007-04-20 85942.00
1583 2007-04-23 88244.00
1584 2007-04-24 90309.50
1585 2007-04-25 92048.00
1586 2007-04-26 92051.70
1587 2007-04-27 95863.10
1588 2007-04-29 96448.40
1589 2007-04-30 96448.40
1590 2007-04-30 96343.40
1591 2007-05-01 96448.40
1592 2007-05-02 96300.44
1593 2007-05-03 93718.52
1594 2007-05-03 96195.60
1595 2007-05-04 93718.52
1596 2007-05-04 93616.50
1597 2007-05-05 93718.52
1598 2007-05-06 92743.82
1599 2007-05-07 90374.60
1600 2007-05-08 89126.18
1601 2007-05-09 87082.47
1602 2007-05-10 85493.73
1603 2007-05-11 85493.73
1604 2007-05-12 85493.73
1605 2007-05-13 85033.21
1606 2007-05-14 89209.41
1607 2007-05-15 89089.19
1608 2007-05-16 90472.62
1609 2007-05-17 90326.51
1610 2007-05-18 90326.51

Any reasons why the dates are shifted by one date? Am I missing some parameters in the merge statement?

TIA and best,
-Tir



R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue 05 Jun 2007 - 12:09:23 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 05 Jun 2007 - 13:31:44 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.