Re: [R] na.approx and columns with NA's

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Mon, 28 May 2007 08:10:49 -0400

Without a small reproducible example there is not much to say. Try halving the number of columns repeatedly until you have an object with 4 columns that exhibits the same behavior, and then do the same with the rows until you get a 4x6 example.
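
The halving procedure described above could be sketched roughly as follows. This is only an illustration on a plain matrix: `shrink_cols` and `shows_problem` are hypothetical names, not part of zoo or base R, and the toy predicate (a column that is entirely NA) is an assumed stand-in for "exhibits the same behavior".

```r
# Hypothetical helper: repeatedly halve the column set, keeping a half
# that still exhibits the problem, until at most `target` columns remain.
shrink_cols <- function(x, shows_problem, target = 4) {
  cols <- seq_len(ncol(x))
  while (length(cols) > target) {
    half  <- seq_len(ceiling(length(cols) / 2))
    left  <- cols[half]
    right <- cols[-half]
    if (shows_problem(x[, left, drop = FALSE])) {
      cols <- left
    } else if (shows_problem(x[, right, drop = FALSE])) {
      cols <- right
    } else {
      break  # the problem needs columns from both halves; stop here
    }
  }
  cols
}

# Toy data: a plain 6 x 16 matrix in which only column 11 is entirely NA.
m <- matrix(as.numeric(1:96), nrow = 6)
m[, 11] <- NA

# Assumed stand-in for "exhibits the same behavior".
shows_problem <- function(x) any(colSums(!is.na(x)) == 0)

keep  <- shrink_cols(m, shows_problem)
small <- m[, keep, drop = FALSE]
dim(small)
```

The same idea applies to the rows, with the predicate replaced by whatever check reproduces the behavior in the real data.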

Here is another, slightly shorter, solution:

library(zoo)

# test data
z <- zoo(matrix(1:24, 6))

z[,2:3] <- NA
z[1, 2] <- 3
z[2, 1] <- NA

z

# calculate: idx is TRUE for each column with more than one non-NA value
idx <- colSums(!is.na(z)) > 1
z[,idx] <- na.approx(z[,idx])
z
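
One difference between this version and the earlier `colSums(!!z, na.rm = TRUE) > 1` may matter if the data can contain exact zeros: `!!x` is FALSE for 0, so the earlier form counts values that are both non-NA and non-zero. The sketch below shows this on a plain matrix, with base `approx()` as an assumed stand-in for `na.approx`; the helper `fill` is a hypothetical name used only here.

```r
# Plain matrix standing in for the zoo object: column 2 has a single
# non-NA value, column 3 is all NA, column 4 is mostly zeros.
m <- matrix(as.numeric(1:24), nrow = 6)
m[, 2:3] <- NA
m[1, 2] <- 3
m[2, 1] <- NA
m[, 4] <- c(0, 0, 0, 0, 0, 7)

# Counts non-NA values per column.
idx_na <- colSums(!is.na(m)) > 1

# Counts values that are non-NA and also non-zero.
idx_nz <- colSums(!!m, na.rm = TRUE) > 1

idx_na   # TRUE FALSE FALSE TRUE
idx_nz   # TRUE FALSE FALSE FALSE

# Hypothetical helper: linear interpolation over the row positions.
# With approx()'s default rule = 1, leading or trailing NAs stay NA.
fill <- function(y) {
  ok <- !is.na(y)
  approx(which(ok), y[ok], xout = seq_along(y))$y
}

# Interpolate only the columns that have at least two non-NA values.
m[, idx_na] <- apply(m[, idx_na, drop = FALSE], 2, fill)
m[2, 1]  # 2
```

With a zoo object the same `idx` would be used with `na.approx`, as in the code above.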

On 5/28/07, antonio rodriguez <antonio.raju_at_gmail.com> wrote:
> Dear Gabor,
>
> To apply your suggestion I had to split my 'big' 720*5551 matrix into
> smaller ones of size 720*400 because of memory constraints. But after
> running the code I get fewer rows in the new matrix. For example:
>
> zz1<-zz[,1:400]
> dim(zz1)
>
> [1] 720 400
>
> zz1[,1]
>
> 1985-01-05 1985-01-13 1985-01-21 1985-01-29 1985-02-06 1985-02-14 1985-02-22
> NA 16.72500 16.50000 16.68750 15.90000 NA 16.20000
> 1985-03-02 1985-03-10 1985-03-18 1985-03-26 1985-04-03 1985-04-11 1985-04-19
> 16.50000 16.20000 15.90000 16.35000 16.27500 16.87500 16.87500
> ........................................................................................................................................
>
> idx <- colSums(!!zz1, na.rm = TRUE) > 1
> zz1[,idx] <- na.approx(zz1[,idx])
> dim(zz1)
>
> [1] 718 400
>
> I've done something similar to your example with random data, but with
> the same number of rows as my original data:
>
> u <- zoo(matrix(rnorm(4320), 6))
> u<-t(u)
>
> dim(u)
> [1] 720 6
>
> u[,2:3] <- NA
> u[1, 2] <- 3
> u[2, 1] <- NA
> idx <- colSums(!!u, na.rm = TRUE) > 1
> u[,idx] <- na.approx(u[,idx])
>
> dim(u)
> [1] 720 6
>
> I don't know what could be happening with my original data.
>
> Best regards
>
> Antonio
>
>
>
> Gabor Grothendieck wrote:
> > na.approx uses approx and has the same behavior as it. Try this:
> >
> >> library(zoo)
> >>
> >> # test data
> >> z <- zoo(matrix(1:24, 6))
> >> z[,2:3] <- NA
> >> z[1, 2] <- 3
> >> z[2, 1] <- NA
> >> z
> >
> > 1 1 3 NA 19
> > 2 NA NA NA 20
> > 3 3 NA NA 21
> > 4 4 NA NA 22
> > 5 5 NA NA 23
> > 6 6 NA NA 24
> >>
> >> # TRUE for each column that has more than 1 non-NA
> >> idx <- colSums(!!z, na.rm = TRUE) > 1
> >> idx
> > [1] TRUE FALSE FALSE TRUE
> >>
> >> z[,idx] <- na.approx(z[,idx])
> >> z
> >
> > 1 1 3 NA 19
> > 2 2 NA NA 20
> > 3 3 NA NA 21
> > 4 4 NA NA 22
> > 5 5 NA NA 23
> > 6 6 NA NA 24
> >
> >
> > On 5/27/07, antonio rodriguez <antonio.raju_at_gmail.com> wrote:
> >> Hi,
> >>
> >> I have a 'zoo' object:
> >>
> >> dim(zz)
> >> [1] 720 5551
> >>
> >> where some columns contain only NA values (representing land in a
> >> sea surface temperature dataset). Using 'na.approx' on individual
> >> columns of the zz matrix is straightforward, but when I apply it to
> >> the whole matrix:
> >>
> >> zz.approx<-na.approx(zz)
> >> Error in approx(along[!na], y[!na], along[na], ...) :
> >> need at least two non-NA values to interpolate
> >>
> >> The message is clear, but how can I skip those all-NA columns in the
> >> interpolation, so that the analysis runs only over the columns that
> >> hold actual data with some NA values?
> >>
> >> Best regards,
> >>
> >> Antonio
> >>
> >> --
> >> =====
> >> If you send me e-mail which has also been sent to several other people,
> >> kindly mark my address as blind-carbon-copy (or BCC), to avoid its
> >> distribution, which affects my privacy, increases the likelihood of
> >> spreading viruses, and leads to more SPAM. Thanks.
> >> =====
> >> Before printing this email, assess if it is really needed.
> >>
> >> ______________________________________________
> >> R-help_at_stat.math.ethz.ch mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>
>
>
>



R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon 28 May 2007 - 12:17:52 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 28 May 2007 - 13:31:16 GMT.
