Re: [R] RODBC, missing values, and Excel

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Thu 13 Jul 2006 - 21:50:47 EST

Perhaps the Excel API requires an absolute path name. Try:

chartr("/", "\\", file.path(getwd(), "myfile.xls"))

where we make sure its using backslashes in case that's also required.

On 7/13/06, Sundar Dorai-Raj <sundar.dorai-raj@pdf.com> wrote:
> Hi, Gabor,
>
> Thanks for the code. When I tried this I get an error when trying to use
> a relative path name:
>
> read.excel <- function(file, sheet, ...) {
> require(rcom)
> require(gdata)
> oxl <- comCreateObject('Excel.Application')
> comSetProperty(oxl, "Visible", TRUE) # this line optional
> owb <- comGetProperty(oxl, "Workbooks")
> ob <- comInvoke(owb, "Open", file)
> osheets <- comGetProperty(ob, "Worksheets")
> n <- comGetProperty(osheets, "Count")
> ithSheetName <- function(i)
> comGetProperty(comGetProperty(osheets, "Item", i), "Name")
> sheetNames <- sapply(1:n, ithSheetName)
> comInvoke(oxl, "Quit")
> read.xls(file, match(sheet, sheetNames), ...)
> }
>
> > read.excel("tmp.xls", "Sheet2", na.strings = "na")
> Error in 1:n : NA/NaN argument
> > read.excel("D:/Users/sundard/frm/config/R/tmp.xls",
> + "Sheet2", na.strings = "na")
> x
> 1 0.11
> 2 0.11
> 3 NA
> 4 NA
> 5 NA
> 6 NA
> 7 0.11
>
> Any reason I need an absolute path?
>
> Thanks again,
>
> --sundar
>
> Gabor Grothendieck wrote:
> > In thinking about this some more I have a better idea. Use rcom (or
> > RDCOMClient)
> > to get a list of the sheet names and then use that to determine which sheet you
> > need. Then use read.xls to get it like this assuming that the Excel
> > file and path are C:\test.xls and that one of the sheets in that spreadsheet
> > is xyz. In my version the na.strings had a space at the end so you may
> > need to change the na.strings= setting:
> >
> > library(rcom)
> > xls <- "C:\\test.xls"
> > oxl <- comCreateObject('Excel.Application')
> > comSetProperty(oxl, "Visible", TRUE) # this line optional
> > owb <- comGetProperty(oxl, "Workbooks")
> > ob <- comInvoke(owb, "Open", xls)
> > osheets <- comGetProperty(ob, "Worksheets")
> > n <- comGetProperty(osheets, "Count")
> > ithSheetName <- function(i)
> > comGetProperty(comGetProperty(osheets, "Item", i), "Name")
> > sheetNames <- sapply(1:n, ithSheetName)
> > comInvoke(oxl, "Quit")
> >
> > library(gdata)
> > read.xls(xls, match("xyz", sheetNames), na.strings = "na ")
> >
> >
> > On 7/12/06, Gabor Grothendieck <ggrothendieck@gmail.com> wrote:
> >
> >>Would it be good enough to just read all the sheets in?
> >>
> >>The perl program can do that and although the read.xls R function does not
> >>interface to that aspect of its functionality its not that difficult to access
> >>it yourself. Assume your excel file is in \test.xls . Just
> >>switch to that folder. paste together a command to run the perl
> >>program, run it, get a list of the file names it produced and read them in:
> >>
> >>library(gdata)
> >>setwd("/")
> >>cmd <- paste("perl", system.file("perl/xls2csv.pl", package = "gdata"), "test")
> >>system(cmd)
> >>ff <- list.files(patt = "test_Sheet.*.csv")
> >>sapply(ff, read.csv, na.strings = "na ", simplify = FALSE)
> >>
> >>
> >>On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj@pdf.com> wrote:
> >>
> >>>Hi, Gabor,
> >>>
> >>>Thanks for the reply. Perhaps Prof. Ripley will enlighten us as he is
> >>>the RODBC maintainer.
> >>>
> >>>Unfortunately, gdata::read.xls will not work for me (at least I don't
> >>>think it will) because I need to refer to each worksheet by name and not
> >>>by number. For example, I need extract data from "Sheet1" and not simply
> >>>the first sheet.
> >>>
> >>>Thanks,
> >>>
> >>>--sundar
> >>>
> >>>Gabor Grothendieck wrote:
> >>>
> >>>>I also got a strange result too (I renamed it sdr.read.xls
> >>>>to distinguish it from read.xls in gdata and noticed that a
> >>>>space got into my na's somehow so I used "na " for my
> >>>>na.strings:
> >>>>
> >>>>
> >>>>
> >>>>>sdr.read.xls("/test.xls", "Sheet2", na.strings = "na ")
> >>>>
> >>>> x
> >>>>1 <NA>
> >>>>2 <NA>
> >>>>3 na
> >>>>4 na
> >>>>5 na
> >>>>6 na
> >>>>7 <NA>
> >>>>
> >>>>I had more success using read.xls in the gdata package.
> >>>>Note that we need to install perl first if not already present:
> >>>>
> >>>>
> >>>>
> >>>>>library(gdata) # for read.xls
> >>>>>read.xls("/test.xls", 2, na.strings = "na ")
> >>>>
> >>>> x
> >>>>1 0.11
> >>>>2 0.11
> >>>>3 NA
> >>>>4 NA
> >>>>5 NA
> >>>>6 NA
> >>>>7 0.11
> >>>>
> >>>>
> >>>>
> >>>>>R.version.string # XP
> >>>>
> >>>>[1] "Version 2.3.1 Patched (2006-06-04 r38279)"
> >>>>
> >>>>
> >>>>>packageDescription("gdata")$Version
> >>>>
> >>>>[1] "2.1.2"
> >>>>
> >>>>
> >>>>>packageDescription("RODBC")$Version
> >>>>
> >>>>[1] "1.1-7"
> >>>>
> >>>>
> >>>>On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj@pdf.com> wrote:
> >>>>
> >>>>
> >>>>>Hi, all,
> >>>>>
> >>>>>I'm trying to use RODBC to read data from Excel. However, I'm having
> >>>>>trouble converting missing values to NA and rather perplexed by the
> >>>>>output. Below illustrates my problem:
> >>>>>
> >>>>>## DATA - copy to Excel and save as "tmp.xls"
> >>>>>## tmp.xls!Sheet1
> >>>>>x
> >>>>>0.11
> >>>>>0.11
> >>>>>na
> >>>>>na
> >>>>>na
> >>>>>0.11
> >>>>>
> >>>>>## tmp.xls!Sheet2
> >>>>>x
> >>>>>0.11
> >>>>>0.11
> >>>>>na
> >>>>>na
> >>>>>na
> >>>>>na
> >>>>>0.11
> >>>>>
> >>>>>## R Code
> >>>>>read.xls <- function(file, sheet = "Sheet1", ...) {
> >>>>> require(RODBC)
> >>>>> channel <- odbcConnectExcel(file)
> >>>>> sheet <- sprintf("select * from `%s$`", sheet)
> >>>>> x <- sqlQuery(channel, sheet, ...)
> >>>>> odbcClose(channel)
> >>>>> x
> >>>>>}
> >>>>>
> >>>>>read.xls("./tmp.xls", "Sheet1", na.strings = "na")
> >>>>>## works as expected
> >>>>># x
> >>>>>#1 0.11
> >>>>>#2 0.11
> >>>>>#3 NA
> >>>>>#4 NA
> >>>>>#5 NA
> >>>>>#6 0.11
> >>>>>
> >>>>>read.xls("./tmp.xls", "Sheet2", na.strings = "na")
> >>>>>## Huh? What happened?
> >>>>># x
> >>>>>#1 NA
> >>>>>#2 NA
> >>>>>#3 NA
> >>>>>#4 NA
> >>>>>#5 NA
> >>>>>#6 NA
> >>>>>#7 NA
> >>>>>
> >>>>>
> >>>>>>sessionInfo()
> >>>>>
> >>>>>Version 2.3.1 (2006-06-01)
> >>>>>i386-pc-mingw32
> >>>>>
> >>>>>attached base packages:
> >>>>>[1] "methods" "stats" "graphics" "grDevices" "utils" "datasets"
> >>>>>[7] "base"
> >>>>>
> >>>>>other attached packages:
> >>>>> RODBC
> >>>>>"1.1-7"
> >>>>>
> >>>>>______________________________________________
> >>>>>R-help@stat.math.ethz.ch mailing list
> >>>>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >>>>>
> >>>>
> >>>>
> >>>>______________________________________________
> >>>>R-help@stat.math.ethz.ch mailing list
> >>>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >>>
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jul 13 21:55:41 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 13 Jul 2006 - 22:15:28 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.