[R] Fwd: read.table() and scientific notation

From: Alex Brown <alex_at_transitive.com>
Date: Tue 10 Oct 2006 - 12:01:12 GMT

Note: this e-mail was meant to precede my "coerce hack" one.

Here is an example of the colClasses approach that other posters mentioned, with some debugging notes:

# create a pretend file for this example

> Lines <- scan(sep="\n", what="")
a 1 3e-8
b 2 1e+10
c 3 e-10
d 4 e+3

> file <- textConnection(Lines)

# import as you would a file, and specify the column types.
> T <- read.table(file, colClasses=c("character", "integer",
"numeric"))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
        scan() expected 'a real', got 'e-10'

# decide that's not very helpful. let's just import everything as character:
# restarting the file.
> file <- textConnection(Lines)
> T <- read.table(file, colClasses="character")
> lapply(T, mode)

$V1
[1] "character"

$V2
[1] "character"

$V3
[1] "character"

> T

  V1 V2    V3
1  a  1  3e-8
2  b  2 1e+10
3  c  3  e-10
4  d  4   e+3

# try the conversion to double:

> (D<-as.double(T$V3))

[1] 3e-08 1e+10    NA    NA
Warning message:
NAs introduced by coercion

# let's see which are bad:
> T[is.na(D),]

  V1 V2   V3
3  c  3 e-10
4  d  4  e+3
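
One possible way to repair such values, sketched under the assumption (stated in the quoted message below) that a bare exponent like e-10 stands for 1e-10. The regular expression, the variable names `raw`, `fixed` and `D2`, and the use of read.table(text=) in place of a textConnection are illustrative choices, not part of the session above:

```r
# Same sample data as above, supplied inline so the sketch is self-contained.
raw <- read.table(text = c("a 1 3e-8", "b 2 1e+10", "c 3 e-10", "d 4 e+3"),
                  colClasses = "character")

# Assumption: a field consisting only of an exponent ("e-10", "e+3")
# is shorthand for a mantissa of 1. Prepend "1" to those fields only;
# everything else is left untouched.
fixed <- sub("^([eE][-+]?[0-9]+)$", "1\\1", raw$V3)

# The conversion should now succeed without introducing NAs.
D2 <- as.double(fixed)
D2
```

Anchoring the pattern with ^ and $ keeps the substitution from touching well-formed numbers or unrelated strings that merely contain an "e".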

-Alex

On 10 Oct 2006, at 12:17, January Weiner wrote:

> Oh, thanks, that was hint enough :-) I see it now. It turns out that R does
> not understand
>
> e-10
>
> ...which stands for 1e-10 and is produced by some of the bioinformatic
> applications that I use (notably BLAST). However, R instead of being
> verbose on that just assumes that the whole column is a string.
>
> Is there a way to enforce a specific conversion in R (for example, to
> be able to see where the errors are)?
>
> January
>
> -- 
> ------------ January Weiner 3  ---------------------+---------------
> Division of Bioinformatics, University of Muenster  |  Schloßplatz 4
> (+49)(251)8321634                                   |  D48149 Münster
> http://www.uni-muenster.de/Biologie.Botanik/ebb/    |  Germany
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
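
The question quoted above (seeing where the conversion errors are) can also be handled for several columns at once. The following is a minimal sketch only: `find_bad_numeric` is a hypothetical helper, not an R function, and it assumes the data were first read with colClasses="character" as in the session above.

```r
# Hypothetical helper: for each named column, list the rows whose text
# cannot be converted to a number.
find_bad_numeric <- function(df, cols = names(df)) {
  out <- lapply(cols, function(col) {
    x <- as.character(df[[col]])
    # suppressWarnings() hides "NAs introduced by coercion";
    # the NAs are inspected explicitly instead.
    bad <- is.na(suppressWarnings(as.numeric(x))) & !is.na(x)
    data.frame(row = which(bad), column = rep(col, sum(bad)),
               value = x[bad], stringsAsFactors = FALSE)
  })
  do.call(rbind, out)
}

raw <- read.table(text = c("a 1 3e-8", "b 2 1e+10", "c 3 e-10", "d 4 e+3"),
                  colClasses = "character")
bad <- find_bad_numeric(raw, c("V2", "V3"))
bad
```

Fields that were already missing in the input are excluded on purpose, so the report contains only values that are present but not parseable.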

Received on Tue Oct 10 22:09:34 2006
