Re: [R] 2 Seemingly Simple Problems


From: ripley@stats.ox.ac.uk
Date: Sat 01 Jun 2002 - 00:07:34 EST


Message-id: <Pine.LNX.4.31.0205311459050.25312-100000@gannet.stats>

On Fri, 31 May 2002, MATT BORKOWSKI wrote:

> Alright...these two issues seem rather simple. But I had trouble finding much
> about either of them in the archives.
>
> 1) Using scan()
> I'm trying to use scan to read in a large data set since read.table() is taking
> quite a bit of time. But when I try to do this I receive an error message along
> the lines of "Character where numeric expected." Seems to me the problem is
> arising because my data is composed of both characters and numbers, but R
> is only expecting numerics. I assume the key to this problem lies in the
> "what=" parameter. But I'm not sure what to set this to so that R expects
> characters or numbers.

See the help page for scan, especially the examples. However, since
read.table itself calls scan, you will gain little by calling scan
directly, provided you supply colClasses to read.table.
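
As a rough sketch of both routes (the file contents and the column names
id, n and x are made up for illustration, not taken from your data),
something along these lines should work:

```r
# Hypothetical three-column file: a character id followed by two numbers.
tmp <- tempfile()
writeLines(c("a 1 2.5", "b 2 3.5", "c 3 4.5"), tmp)

# scan(): 'what' is a list whose elements give the type of each field
# ("" for character, 0 for numeric).
cols <- scan(tmp, what = list(id = "", n = 0, x = 0), quiet = TRUE)

# read.table(): colClasses states the column types up front, so they
# do not have to be worked out from the data.
df <- read.table(tmp, colClasses = c("character", "numeric", "numeric"),
                 col.names = c("id", "n", "x"))
unlink(tmp)
```

scan returns a list of vectors here, whereas read.table returns a data
frame, so the second form is usually the more convenient one.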

> 2) Testing for 'NA' values
> In this problem I have read in a large data set. Some of the lines of data are
> not as long and therefore the last few columns have been filled in with 'NA.'
> Now I'm trying to read through rows of data backwards because the parameter
> I'm trying to extract from the data.frame is not always in column 5 but is always
> the second real value after the 'NA's' if that makes any sense. But I don't think

(No. The NAs are at the end of the row, so the second before?)

> that's all that important anyway. The point is...I'm trying to extract the second
> value after the 'NA' values by ignoring the 'NA' values and counting any real
> values. I'm trying to accomplish this with:
>
> if(data[r,c] != NA) count <- count +1
>
> However, I receive the error: "Value missing where logical expected". I assume
> this is happening because I'm testing for 'NA' values. Is there any way around
> this? Is there a way to count the number of 'NA' numbers or a way to skip over
> them?

is.na(data[r,]) would be a good start. Something like

{xx <- is.na(data[r,]); n <- sum(!xx); data[r, n-1]}

for one row perhaps? Or to vectorize

nn <- rowSums(!is.na(data)) # number of non-NA values in each row
data[cbind(seq(along=nn), nn-1)]
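
To make that concrete, here is a small sketch on invented data (the data
frame below and the guard for short rows are my assumptions, not part of
your problem). It also shows why the comparison with NA fails: the result
of != NA is itself NA, which if() cannot use, whereas is.na() always
gives TRUE or FALSE. Per-row counts are taken with rowSums.

```r
# Hypothetical data: rows padded on the right with NA.
data <- data.frame(a = c(1, 4, 7),
                   b = c(2, 5, NA),
                   c = c(3, NA, NA),
                   d = c(10, NA, NA))

# Comparing with NA yields NA, not TRUE or FALSE.
cmp <- data[3, 3] != NA      # NA
ok  <- !is.na(data[3, 3])    # FALSE

# Number of non-NA values in each row.
nn <- rowSums(!is.na(data))

# Rows with fewer than two values would give a column index of 0,
# so handle them separately and leave their result as NA.
has2 <- nn >= 2
res <- rep(NA_real_, nrow(data))
res[has2] <- data[cbind(which(has2), nn[has2] - 1)]
res
```

Indexing a data frame by a two-column matrix picks one (row, column)
element per matrix row, which is what makes the loop unnecessary.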

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



