Re: [R] Problems with Merge

From: Patnaik, Tirthankar <tirthankar.patnaik_at_citi.com>
Date: Tue, 05 Jun 2007 18:32:54 +0530


Prof. Ripley,

        Date is a column in a matrix, which should be ok for the merge. Actually the IMHO problem was with the default sorting option, I set sort=FALSE, and the merge happened as it should.

Best,
-Tir

Tirthankar Patnaik
Analyst, India Strategy
Citigroup Investment Research
+91-22-6631 9887

P.S. Agree with you on the trailer of the message--I had to post the message in a hurry, though!

> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley_at_stats.ox.ac.uk]
> Sent: Tuesday, June 05, 2007 6:08 PM
> To: Patnaik, Tirthankar [GWM-CIR]
> Cc: r-help_at_stat.math.ethz.ch
> Subject: Re: [R] Problems with Merge
>
> Take a look at the help for merge(): in all the examples by.x
> is a character string, not a one-column data frame which is
> what rhistory["Date"] would appear to be.
>
> Please note the trailer of this messsage.
>
> On Tue, 5 Jun 2007, Patnaik, Tirthankar wrote:
>
> > Hi,
> > I have a history dataset, a matrix with about 1590 obs, and
> 242 cols,
> > and I need to update this matrix with an 'update' matrix that has
> > about 30 rows, and roughly similar number of columns as the
> history ds
> > (but not necessarily equal). The update dataset is read
> from an Excel
> > ODBC connection. When I try and merge these datasets, I get
> > counter-intuitive results.
> >
> > library(RODBC)
> > chn <- odbcConnectExcel(UpdateFile)
> > sqlTables(chn)
> > UpdateData <- sqlFetch(chn,"MCap243")
> > colnames(UpdateData) <- gsub("#",".",colnames(UpdateData))
> > close(chn)
> > # specify just how many rows we need from the Update
> file. We'd only
> > read five
> > # rows at a time.
> > UpdateRows = 20
> > UpdateData <- UpdateData[1:UpdateRows,]
> >
> > # Delete Unwanted stocks.
> > UpdateData <- UpdateData[,!names(UpdateData) %in% ToBeDeleted]
> > x <- tail(UpdateData[c("Date","ABAN.BO")],n=50)
> > print(x)
> >
> >
> > Gives x as:
> >
> > Date ABAN.BO
> > 1 2007-04-30 96448.40
> > 2 2007-05-01 96448.40
> > 3 2007-05-02 96448.40
> > 4 2007-05-03 96300.44
> > 5 2007-05-04 93718.52
> > 6 2007-05-05 93718.52
> > 7 2007-05-06 93718.52
> > 8 2007-05-07 92743.82
> > 9 2007-05-08 90374.60
> > 10 2007-05-09 89126.18
> > 11 2007-05-10 87082.47
> > 12 2007-05-11 85493.73
> > 13 2007-05-12 85493.73
> > 14 2007-05-13 85493.73
> > 15 2007-05-14 85033.21
> > 16 2007-05-15 89209.41
> > 17 2007-05-16 89089.19
> > 18 2007-05-17 90472.62
> > 19 2007-05-18 90326.51
> > 20 2007-05-19 90326.51
> >
> > But when I merge this file with the history dataset, I get
> the dates
> > misaligned by one row.
> >
> > whistory <-
> merge(rhistory,UpdateData,by.x=rhistory["Date"],all=TRUE)
> >
> > tail(WHist4[c("Date","ABAN.BO")],n=30)
> >
> > Date ABAN.BO
> > 1581 2007-04-19 83632.60
> > 1582 2007-04-20 85942.00
> > 1583 2007-04-23 88244.00
> > 1584 2007-04-24 90309.50
> > 1585 2007-04-25 92048.00
> > 1586 2007-04-26 92051.70
> > 1587 2007-04-27 95863.10
> > 1588 2007-04-29 96448.40
> > 1589 2007-04-30 96448.40
> > 1590 2007-04-30 96343.40
> > 1591 2007-05-01 96448.40
> > 1592 2007-05-02 96300.44
> > 1593 2007-05-03 93718.52
> > 1594 2007-05-03 96195.60
> > 1595 2007-05-04 93718.52
> > 1596 2007-05-04 93616.50
> > 1597 2007-05-05 93718.52
> > 1598 2007-05-06 92743.82
> > 1599 2007-05-07 90374.60
> > 1600 2007-05-08 89126.18
> > 1601 2007-05-09 87082.47
> > 1602 2007-05-10 85493.73
> > 1603 2007-05-11 85493.73
> > 1604 2007-05-12 85493.73
> > 1605 2007-05-13 85033.21
> > 1606 2007-05-14 89209.41
> > 1607 2007-05-15 89089.19
> > 1608 2007-05-16 90472.62
> > 1609 2007-05-17 90326.51
> > 1610 2007-05-18 90326.51
> >
> > Any reasons why the dates are shifted by one date? Am I
> missing some
> > parameters in the merge statement?
> >
> > TIA and best,
> > -Tir
> >
> > ______________________________________________
> > R-help_at_stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Brian D. Ripley, ripley_at_stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>



R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue 05 Jun 2007 - 13:15:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 05 Jun 2007 - 13:31:44 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.