1.
Excel uses the first 16 rows of data to determine if a column is numeric or
character. The data type which is most common in the first 16 rows will then
be used for the whole column. If you sort the data so that at least the
first 9 rows have character data, you may find this allows the data to be
interpreted as character. There is supposedly a registy setting that can
control how many lines to use (instead of 16), but I have not had success
with the setting. I suspect that ODBC uses JET4, which may be the real
source of the problem. See more here:
2.
The gregmisc bundle has a different read.xls function that uses a Perl
script (xls2csv) and seems to be safer with mixed-type columns.
Requires a working version of Perl.
Best,
Kevin Wright
The first column in my Excel sheet has mostly numbers but I need to treat it
as character data:
> library(RODBC)
> names(plasmid)
[1] "Plasmid Number" "Plasmid" "Concentration" "Comments" "Lost"
# How is the type decided? I need a character type.
[1] "numeric"
[1] "double"
> plasmid$"Plasmid Number"[273:276]
[1] 274 NA NA 276
The two NAs are supposed to be 275a and 275b. I tried the "as.is=TRUE" but
that didn't help.
I consulted Section 4, Relational databases, in the R Data Import/Export
document (for Version 2.2.0).
Section 4.2.2, Data types, was not helpful. In particular, this did not seem
helpful: "The more comprehensive of the R interface packages hide the type
conversion issues from the user."
Section 4.3.2, Package RODBC, provided a "simple example of using ODBC ..
with a(sic) Excel spreadsheet" but is silent on how to control the data type
on import. Could the documentation be expanded to address this issue?
I really need to show "Plasmid 275a" and "Plasmid 275b" instead of "Plasmid
NA".
Thanks for any help with this.
efg
>From my experience (somewhat of a guess):
http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/
<http://tolstoy.newcastle.edu.au/R/help/05/09/11324.html#14938qlink1>
*> channel <- odbcConnectExcel("U:/efg/lab/R/Plasmid/construct list.xls") *
*> plasmid <- sqlFetch(channel,"Sheet1", as.is=TRUE) *
*> odbcClose(channel) *
> class(plasmid$"Plasmid Number")
> typeof(plasmid$"Plasmid Number")
--
Earl F. Glynn
Scientific Programmer
Bioinformatics Department
[[alternative HTML version deleted]]
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Nov 04 02:37:20 2005
This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:40:55 EST