Re: [R] subscripting in data frames with NA

From: Peter Dalgaard <P.Dalgaard_at_biostat.ku.dk>
Date: Tue, 24 Jun 2008 13:11:37 +0200

Agustin Lobo wrote:
> Dear list:
>
> Given
> > str(b3)
> 'data.frame': 159 obs. of 6 variables:
> $ index_pollution : num 8.228 10.513 0.549 0.915 10.416 ...
> $ position_descrip: chr "2" "2" "2" NA ...
> $ position_geo : chr "3" "0" "3" "3" ...
> $ institution : Factor w/ 3 levels "digesa","mem",..: 3 3 3 3 3 3
> 3 3 3 3 ...
> $ p_desc_no3 : chr "2" "2" "2" NA ...
> $ p_geo_no3 : chr "3" "0" "3" "3" ...
>
> I try to subscript but get:
>
> > b3[b3[,3]=="3",5] <-NA
> Error in `[<-.data.frame`(`*tmp*`, b3[, 3] == "3", 5, value = NA) :
> missing values are not allowed in subscripted assignments of data
> frames
Notice that it is not the NA on the right that is the problem, but those in the subscript: b3[,3]=="3" is NA wherever b3[,3] is NA. So try

b3[b3[,3]=="3" | is.na(b3[,3]), 5] <- NA

(or use & !is.na(b3[,3]) in place of | is.na(b3[,3]), if you want the rows with a missing value left untouched)
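
To make the difference between the two variants concrete, here is a minimal sketch on a toy data frame. The data frame `d`, its columns `geo` and `desc`, and the values in them are made up for illustration and are not the poster's data. The `which()` form is a common alternative not mentioned above: it drops the NAs from a logical subscript, so it should behave like the `& !is.na(...)` variant.

```r
# Toy stand-in for b3 (hypothetical values, not the original data).
d <- data.frame(geo  = c("3", "0", NA, "3"),
                desc = c("2", "1", "2", NA),
                stringsAsFactors = FALSE)

# The comparison is NA wherever geo is NA, so the subscript contains an NA.
sel <- d$geo == "3"

# Variant 1: rows with a missing geo are also set to NA.
d_or <- d
d_or[sel | is.na(d_or$geo), "desc"] <- NA

# Variant 2: rows with a missing geo are left untouched.
d_and <- d
d_and[sel & !is.na(d_and$geo), "desc"] <- NA

# Alternative: which() keeps only the positions that are TRUE.
d_which <- d
d_which[which(sel), "desc"] <- NA
```

Which variant is right depends on whether a missing position_geo should count as a match or not.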
> Why? What's the correct way of doing this operation?
I forget the exact reason, but as far as I remember, we allowed it at some point, but found that the behaviour was inconsistent between different modes of subassignment.

> Actually, I previously tried with:
> > str(b2)
> 'data.frame': 159 obs. of 6 variables:
> $ index_pollution : num 8.228 10.513 0.549 0.915 10.416 ...
> $ position_descrip: Factor w/ 3 levels "0","1","2": 3 3 3 NA NA NA 3
> 3 3 3 ...
> $ position_geo : Factor w/ 4 levels "0","1","2","3": 4 1 4 4 3 NA
> 3 3 3 4 ...
> $ institution : Factor w/ 3 levels "digesa","mem",..: 3 3 3 3 3 3
> 3 3 3 3 ...
> $ p_desc_no3 : Factor w/ 3 levels "0","1","2": 3 3 3 NA NA NA 3
> 3 3 3 ...
> $ p_geo_no3 : Factor w/ 4 levels "0","1","2","3": 4 1 4 4 3 NA
> 3 3 3 4 ...
>
> > table(b2$p_desc_no3)
>
> 0 1 2
> 42 44 66
>
> and
>
> > levels(b2$p_desc_no3)[levels(b2$position_geo)=="3"] <- NA
>
> which does not result into error but leaves b2$p_desc_no3 unchanged:
>
I don't think this makes sense at all. It changes the 4th level of a three-level factor???
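
A small sketch of why that assignment cannot work, on made-up factors (`desc` and `geo` are hypothetical stand-ins for p_desc_no3 and position_geo). It assumes the intent was to set p_desc_no3 to NA in the rows where position_geo is "3"; if so, the subscript should be built from the values of the factor, not from its levels.

```r
# Hypothetical stand-ins for b2$p_desc_no3 and b2$position_geo.
desc <- factor(c("2", "1", "0", NA, "2"))            # three levels: "0", "1", "2"
geo  <- factor(c("3", "0", "3", "3", NA),
               levels = c("0", "1", "2", "3"))       # four levels

# This has one element per LEVEL of geo, not one per row,
# and it is TRUE only in position 4.
idx <- levels(geo) == "3"

# levels(desc)[idx] therefore addresses a 4th level that desc does not have.

# Row-wise version of what was presumably intended;
# which() discards the NA produced by the missing geo value.
desc_fixed <- desc
desc_fixed[which(geo == "3")] <- NA
```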
> > table(b2$p_desc_no3)
>
> 0 1 2
> 42 44 66
>
>
> what am i doing wrong?
>
> Thanks
>
> Agus
>
>
>

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)              FAX: (+45) 35327907

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 24 Jun 2008 - 11:14:50 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 24 Jun 2008 - 11:30:49 GMT.