Re: [R] "Missing value representation in Excel before extraction to R with RODBC"

From: Fredrik Lundgren <fredrik.bg.lundgren_at_bredband.net>
Date: Wed 11 Jan 2006 - 00:37:30 EST

Dear Petr,

Thank you for your help. I have tried (and succeded) to import myfile after conversion to text and with the help of read.table (also with the file = 'clipboard' alternative). Both methods give correct results, albeit dateformat turns up as a factor (minor problem). Also the read.xls from library gdata has been successful, albeit with some different representation of dateformat( minor problem). At least my Excelfile isn't corrupted in such a way to make this three ways impossible. No, my problem appears to be connected to the use of RODBC and that was what I wanted to get working. The first 8 rows are excluded from the file and columns with many NA's at the start are tranformed to all NA. If the NA's at the beginning of a column are given values (i. e. not NA) the tranformation of the column doesn't take place but the first 8 rows are still excluded. I have tried - not necessarily in a correct way - to use as.is (keeps the dateformat correct) and colClasses (doesn't apply?) but haven't been able to sort the problem out with these options.

Best wishes

Fredrik
----- Original Message -----
From: "Petr Pikal" <petr.pikal@precheza.cz> To: "Fredrik Lundgren" <fredrik.bg.lundgren@bredband.net>; "R-help" <r-help@stat.math.ethz.ch>
Sent: Tuesday, January 10, 2006 10:08 AM Subject: Re: [R] "Missing value representation in Excel before extraction to R with RODBC"

> Hi
>
> I tried to reproduce what you have told us by copy and paste
>
> read.delim("clipboard")
>
> but was not successful.
>
> Even with several blank values in each column in Excel i got correct
> import to R by this process. As I do not use RODBC I do not know all
> possible settings and features. If colClasses is available you can
> force the columns to by character, numeric, factor, Date or some
> other class.
>
> BTW Excel can be quite tricky and hides e.g. spaces in cells so you
> see them as empty even if they are not. So if I get some weird
> conversions of numeric columns there is often something hidden in
> Excel.
>
> HTH
> Petr
>
>
> On 9 Jan 2006 at 18:06, Fredrik Lundgren wrote:
>
> From: "Fredrik Lundgren" <fredrik.bg.lundgren@bredband.net>
> To: "Prof Brian Ripley" <ripley@stats.ox.ac.uk>,
> "Petr Pikal" <petr.pikal@precheza.cz>
> Copies to: "R-help" <r-help@stat.math.ethz.ch>
> Subject: Re: [R] "Missing value representation in Excel before
> extraction to R with RODBC"
> Date sent: Mon, 9 Jan 2006 18:06:49 +0100
>
>> Dear list,
>>
>> Well, those columns in Excel that starts with NA (actually 8 NA's in
>> my case) is imported as all NA in R but if the columns starts with at
>> least 3 cells with values (i.e not NA) the are imported correctly to
>> R. When as.is=TRUE is used a simular conversion takes place but now
>> as
>> all <NA> and dates are represented as date-and-time. Is there any way
>> to get this correct even when the Excel columns start with several
>> NA's?
>>
>> Sincerely
>> Fredrik
>>
>>
>> ----- Original Message -----
>> From: "Prof Brian Ripley" <ripley@stats.ox.ac.uk>
>> To: "Petr Pikal" <petr.pikal@precheza.cz>
>> Cc: "Fredrik Lundgren" <fredrik.bg.lundgren@bredband.net>; "R-help"
>> <r-help@stat.math.ethz.ch> Sent: Monday, January 09, 2006 9:36 AM
>> Subject: Re: [R] "Missing value representation in Excel before
>> extraction to R with RODBC"
>>
>>
>> > On Mon, 9 Jan 2006, Petr Pikal wrote:
>> >
>> >> Hi
>> >>
>> >> I believe it has something to do with the column identification
>> >> decision. When R decides what is in a column it uses only some
>> >> values from the beginning of a file.
>> >
>> > Not R, Excel. Excel tells ODBC what the column types are.
>> >
>> >> I do not use RODBC as read.delim("clipboard", ...) is usually more
>> >> convenient but probably there is a way how to tell RODBC what is
>> >> in
>> >> the column instead of let R decide from the top of the file.
>> >
>> > Using as.is=TRUE stops RODBC doing any conversion.
>> >
>> >> But I may be completely mistaken.
>> >>
>> >> HTH
>> >> Petr
>> >>
>> >>
>> >> On 6 Jan 2006 at 20:47, Fredrik Lundgren wrote:
>> >>
>> >> From: "Fredrik Lundgren"
>> >> <fredrik.bg.lundgren@bredband.net> To: "R-help"
>> >> <r-help@stat.math.ethz.ch> Date sent: Fri, 6 Jan 2006
>> >> 20:47:29
>> >> +0100 Subject: [R] "Missing value representation in Excel
>> >> before extraction to R with RODBC"
>> >>
>> >>> Dear list,
>> >>>
>> >>> How should missing values be expressed in Excel before extraction
>> >>> to R via RODBC. I'm bewildered. Sometimes the representation with
>> >>> NA in Excel appears to work and shows up in R as <NA> but
>> >>> sometimes the use of NA in Excel changes the whole vector to
>> >>> NA's.
>> >>> Blank or nothing or NA as representation for missing values in
>> >>> Excel with dateformat gives NA's of the whole vector in R but
>> >>> with
>> >>> general format in Excel gives blanks for missing values in R.
>> >>> How
>> >>> should I represent missing values in Excel?
>> >>>
>> >>>
>> >>> Best wishes and thanks for any help
>> >>> Fredrik Lundgren
>> >
>> > --
>> > Brian D. Ripley, ripley@stats.ox.ac.uk
>> > Professor of Applied Statistics,
>> > http://www.stats.ox.ac.uk/~ripley/
>> > University of Oxford, Tel: +44 1865 272861 (self) 1
>> > South Parks Road, +44 1865 272866 (PA) Oxford
>> > OX1 3TG, UK Fax: +44 1865 272595
>> >
>>
>>
>
> Petr Pikal
> petr.pikal@precheza.cz
>
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Wed Jan 11 00:50:29 2006

This archive was generated by hypermail 2.1.8 : Wed 11 Jan 2006 - 02:12:40 EST