Re: [R] RODBC, missing values, and Excel

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Thu 13 Jul 2006 - 12:24:12 EST

Would it be good enough to just read all the sheets in?

The perl program can do that and although the read.xls R function does not interface to that aspect of its functionality its not that difficult to access it yourself. Assume your excel file is in \test.xls . Just switch to that folder. paste together a command to run the perl program, run it, get a list of the file names it produced and read them in:

library(gdata)
setwd("/")
cmd <- paste("perl", system.file("perl/xls2csv.pl", package = "gdata"), "test") system(cmd)
ff <- list.files(patt = "test_Sheet.*.csv") sapply(ff, read.csv, na.strings = "na ", simplify = FALSE)

On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj@pdf.com> wrote:
> Hi, Gabor,
>
> Thanks for the reply. Perhaps Prof. Ripley will enlighten us as he is
> the RODBC maintainer.
>
> Unfortunately, gdata::read.xls will not work for me (at least I don't
> think it will) because I need to refer to each worksheet by name and not
> by number. For example, I need extract data from "Sheet1" and not simply
> the first sheet.
>
> Thanks,
>
> --sundar
>
> Gabor Grothendieck wrote:
> > I also got a strange result too (I renamed it sdr.read.xls
> > to distinguish it from read.xls in gdata and noticed that a
> > space got into my na's somehow so I used "na " for my
> > na.strings:
> >
> >
> >>sdr.read.xls("/test.xls", "Sheet2", na.strings = "na ")
> >
> > x
> > 1 <NA>
> > 2 <NA>
> > 3 na
> > 4 na
> > 5 na
> > 6 na
> > 7 <NA>
> >
> > I had more success using read.xls in the gdata package.
> > Note that we need to install perl first if not already present:
> >
> >
> >>library(gdata) # for read.xls
> >>read.xls("/test.xls", 2, na.strings = "na ")
> >
> > x
> > 1 0.11
> > 2 0.11
> > 3 NA
> > 4 NA
> > 5 NA
> > 6 NA
> > 7 0.11
> >
> >
> >>R.version.string # XP
> >
> > [1] "Version 2.3.1 Patched (2006-06-04 r38279)"
> >
> >>packageDescription("gdata")$Version
> >
> > [1] "2.1.2"
> >
> >>packageDescription("RODBC")$Version
> >
> > [1] "1.1-7"
> >
> >
> > On 7/12/06, Sundar Dorai-Raj <sundar.dorai-raj@pdf.com> wrote:
> >
> >>Hi, all,
> >>
> >>I'm trying to use RODBC to read data from Excel. However, I'm having
> >>trouble converting missing values to NA and rather perplexed by the
> >>output. Below illustrates my problem:
> >>
> >>## DATA - copy to Excel and save as "tmp.xls"
> >>## tmp.xls!Sheet1
> >>x
> >>0.11
> >>0.11
> >>na
> >>na
> >>na
> >>0.11
> >>
> >>## tmp.xls!Sheet2
> >>x
> >>0.11
> >>0.11
> >>na
> >>na
> >>na
> >>na
> >>0.11
> >>
> >>## R Code
> >>read.xls <- function(file, sheet = "Sheet1", ...) {
> >> require(RODBC)
> >> channel <- odbcConnectExcel(file)
> >> sheet <- sprintf("select * from `%s$`", sheet)
> >> x <- sqlQuery(channel, sheet, ...)
> >> odbcClose(channel)
> >> x
> >>}
> >>
> >>read.xls("./tmp.xls", "Sheet1", na.strings = "na")
> >>## works as expected
> >># x
> >>#1 0.11
> >>#2 0.11
> >>#3 NA
> >>#4 NA
> >>#5 NA
> >>#6 0.11
> >>
> >>read.xls("./tmp.xls", "Sheet2", na.strings = "na")
> >>## Huh? What happened?
> >># x
> >>#1 NA
> >>#2 NA
> >>#3 NA
> >>#4 NA
> >>#5 NA
> >>#6 NA
> >>#7 NA
> >>
> >> > sessionInfo()
> >>Version 2.3.1 (2006-06-01)
> >>i386-pc-mingw32
> >>
> >>attached base packages:
> >>[1] "methods" "stats" "graphics" "grDevices" "utils" "datasets"
> >>[7] "base"
> >>
> >>other attached packages:
> >> RODBC
> >>"1.1-7"
> >>
> >>______________________________________________
> >>R-help@stat.math.ethz.ch mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >>
> >
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jul 13 12:30:31 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 13 Jul 2006 - 14:14:09 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.