Re: [R] RODBC, missing values, and Excel

From: Sundar Dorai-Raj <sundar.dorai-raj_at_pdf.com>
Date: Thu 13 Jul 2006 - 21:43:56 EST

Hi, Gabor,

Thanks for the code. When I tried this I get an error when trying to use a relative path name:

read.excel <- function(file, sheet, ...) {

   require(rcom)
   require(gdata)
   oxl <- comCreateObject('Excel.Application')    comSetProperty(oxl, "Visible", TRUE) # this line optional    owb <- comGetProperty(oxl, "Workbooks")    ob <- comInvoke(owb, "Open", file)
   osheets <- comGetProperty(ob, "Worksheets")    n <- comGetProperty(osheets, "Count")    ithSheetName <- function(i)
     comGetProperty(comGetProperty(osheets, "Item", i), "Name")    sheetNames <- sapply(1:n, ithSheetName)    comInvoke(oxl, "Quit")
   read.xls(file, match(sheet, sheetNames), ...) }

 > read.excel("tmp.xls", "Sheet2", na.strings = "na") Error in 1:n : NA/NaN argument
 > read.excel("D:/Users/sundard/frm/config/R/tmp.xls",

+            "Sheet2", na.strings = "na")
       x

1 0.11
2 0.11
3 NA
4 NA
5 NA
6 NA
7 0.11

Any reason I need an absolute path?

Thanks again,

--sundar

Gabor Grothendieck wrote:
> In thinking about this some more I have a better idea. Use rcom (or
> RDCOMClient)
> to get a list of the sheet names and then use that to determine which sheet you
> need. Then use read.xls to get it like this assuming that the Excel
> file and path are C:\test.xls and that one of the sheets in that spreadsheet
> is xyz. In my version the na.strings had a space at the end so you may
> need to change the na.strings= setting:
>
> library(rcom)
> xls <- "C:\\test.xls"
> oxl <- comCreateObject('Excel.Application')
> comSetProperty(oxl, "Visible", TRUE) # this line optional
> owb <- comGetProperty(oxl, "Workbooks")
> ob <- comInvoke(owb, "Open", xls)
> osheets <- comGetProperty(ob, "Worksheets")
> n <- comGetProperty(osheets, "Count")
> ithSheetName <- function(i)
> comGetProperty(comGetProperty(osheets, "Item", i), "Name")
> sheetNames <- sapply(1:n, ithSheetName)
> comInvoke(oxl, "Quit")
>
> library(gdata)
> read.xls(xls, match("xyz", sheetNames), na.strings = "na ")
>
>
> On 7/12/06, Gabor Grothendieck <ggrothendieck@gmail.com> wrote:
>

>>Would it be good enough to just read all the sheets in?
>>
>>The perl program can do that and although the read.xls R function does not
>>interface to that aspect of its functionality its not that difficult to access
>>it yourself.  Assume your excel file is in \test.xls .  Just
>>switch to that folder.  paste together a command to run the perl
>>program, run it, get a list of the file names it produced and read them in:
>>
>>library(gdata)
>>setwd("/")
>>cmd <- paste("perl", system.file("perl/xls2csv.pl", package = "gdata"), "test")
>>system(cmd)
>>ff <- list.files(patt = "test_Sheet.*.csv")
>>sapply(ff, read.csv, na.strings = "na ", simplify = FALSE)
>>
>>
>>On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj@pdf.com> wrote:
>>
>>>Hi, Gabor,
>>>
>>>Thanks for the reply. Perhaps Prof. Ripley will enlighten us as he is
>>>the RODBC maintainer.
>>>
>>>Unfortunately, gdata::read.xls will not work for me (at least I don't
>>>think it will) because I need to refer to each worksheet by name and not
>>>by number. For example, I need extract data from "Sheet1" and not simply
>>>the first sheet.
>>>
>>>Thanks,
>>>
>>>--sundar
>>>
>>>Gabor Grothendieck wrote:
>>>
>>>>I also got a strange result too (I renamed it sdr.read.xls
>>>>to distinguish it from read.xls in gdata and noticed that a
>>>>space got into my na's somehow so I used "na " for my
>>>>na.strings:
>>>>
>>>>
>>>>
>>>>>sdr.read.xls("/test.xls", "Sheet2", na.strings = "na ")
>>>>
>>>>     x
>>>>1 <NA>
>>>>2 <NA>
>>>>3   na
>>>>4   na
>>>>5   na
>>>>6   na
>>>>7 <NA>
>>>>
>>>>I had more success using read.xls in the gdata package.
>>>>Note that we need to install perl first if not already present:
>>>>
>>>>
>>>>
>>>>>library(gdata)  # for read.xls
>>>>>read.xls("/test.xls", 2, na.strings = "na ")
>>>>
>>>>     x
>>>>1 0.11
>>>>2 0.11
>>>>3   NA
>>>>4   NA
>>>>5   NA
>>>>6   NA
>>>>7 0.11
>>>>
>>>>
>>>>
>>>>>R.version.string # XP
>>>>
>>>>[1] "Version 2.3.1 Patched (2006-06-04 r38279)"
>>>>
>>>>
>>>>>packageDescription("gdata")$Version
>>>>
>>>>[1] "2.1.2"
>>>>
>>>>
>>>>>packageDescription("RODBC")$Version
>>>>
>>>>[1] "1.1-7"
>>>>
>>>>
>>>>On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj@pdf.com> wrote:
>>>>
>>>>
>>>>>Hi, all,
>>>>>
>>>>>I'm trying to use RODBC to read data from Excel. However, I'm having
>>>>>trouble converting missing values to NA and rather perplexed by the
>>>>>output. Below illustrates my problem:
>>>>>
>>>>>## DATA - copy to Excel and save as "tmp.xls"
>>>>>## tmp.xls!Sheet1
>>>>>x
>>>>>0.11
>>>>>0.11
>>>>>na
>>>>>na
>>>>>na
>>>>>0.11
>>>>>
>>>>>## tmp.xls!Sheet2
>>>>>x
>>>>>0.11
>>>>>0.11
>>>>>na
>>>>>na
>>>>>na
>>>>>na
>>>>>0.11
>>>>>
>>>>>## R Code
>>>>>read.xls <- function(file, sheet = "Sheet1", ...) {
>>>>> require(RODBC)
>>>>> channel <- odbcConnectExcel(file)
>>>>> sheet <- sprintf("select * from `%s$`", sheet)
>>>>> x <- sqlQuery(channel, sheet, ...)
>>>>> odbcClose(channel)
>>>>> x
>>>>>}
>>>>>
>>>>>read.xls("./tmp.xls", "Sheet1", na.strings = "na")
>>>>>## works as expected
>>>>>#     x
>>>>>#1 0.11
>>>>>#2 0.11
>>>>>#3   NA
>>>>>#4   NA
>>>>>#5   NA
>>>>>#6 0.11
>>>>>
>>>>>read.xls("./tmp.xls", "Sheet2", na.strings = "na")
>>>>>## Huh? What happened?
>>>>>#   x
>>>>>#1 NA
>>>>>#2 NA
>>>>>#3 NA
>>>>>#4 NA
>>>>>#5 NA
>>>>>#6 NA
>>>>>#7 NA
>>>>>
>>>>>
>>>>>>sessionInfo()
>>>>>
>>>>>Version 2.3.1 (2006-06-01)
>>>>>i386-pc-mingw32
>>>>>
>>>>>attached base packages:
>>>>>[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
>>>>>[7] "base"
>>>>>
>>>>>other attached packages:
>>>>> RODBC
>>>>>"1.1-7"
>>>>>
>>>>>______________________________________________
>>>>>R-help@stat.math.ethz.ch mailing list
>>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>>>>
>>>>
>>>>
>>>>______________________________________________
>>>>R-help@stat.math.ethz.ch mailing list
>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>>

>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jul 13 21:49:38 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 13 Jul 2006 - 22:15:28 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.