Re: [R] Getting number of students with zeroes in long format

From: Dennis Murphy <djmuser_at_gmail.com>
Date: Wed, 06 Apr 2011 19:22:49 -0700

Hi:

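Another approach would be to use xtabs(), which here sums sus within each id_r, so a total of zero identifies a student who was never suspended (assuming sus is never negative). Letting df represent your example data frame,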

(u <- with(df, xtabs(sus ~ id_r)))
# IDs returned as a character vector:
names(u)[u == 0L]

 [1] "11" "16" "18" "19" "20" "21" "22" "24" "26" "30" "31" "32" "33"
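
To turn that into the count that was asked for, here is a minimal sketch. It assumes the 20 rows quoted below stand in for the full suslm data, that sus is never negative and has no missing values; the object names are only illustrative. It also cross-checks against tapply(), the function named in the quoted reply. The quoted error looks like what lapply() reports when its second argument is not a function, so tapply() was presumably intended there.

```r
# Illustrative data: the 20 rows quoted further down in this thread
# (assumption: they stand in for the full suslm data set).
df <- data.frame(
  id_r = c(11, 15, 16, 18, 19, 19, 20, 21, 21, 22,
           24, 24, 25, 26, 26, 30, 30, 31, 32, 33),
  sus  = c(0, 10, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 3, 0, 0, 0, 0, 0, 0, 0)
)

# xtabs() sums sus within each id_r; a total of 0 means no suspensions,
# provided sus is never negative and has no missing values.
u <- with(df, xtabs(sus ~ id_r))
never_xtabs <- names(u)[u == 0]
length(never_xtabs)   # number of students never suspended (13 for these rows)

# Cross-check with tapply(): TRUE for an id only if every sus value is 0.
all_zero <- with(df, tapply(sus, id_r, function(s) all(s == 0)))
never_tapply <- names(all_zero)[all_zero]
sum(all_zero)         # same count, obtained without summing
```

The tapply() version does not depend on sus being non-negative, so it is the safer of the two if the coding of sus is in any doubt.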

HTH,
Dennis

On Wed, Apr 6, 2011 at 2:10 PM, Christopher Desjardins <cddesjardins_at_gmail.com> wrote:

> On Wed, Apr 6, 2011 at 4:03 PM, Douglas Bates <bates@stat.wisc.edu> wrote:
>
> > On Wed, Apr 6, 2011 at 3:44 PM, Christopher Desjardins
> > <cddesjardins_at_gmail.com> wrote:
> > > Hi,
> > > I have longitudinal school suspension data on students. I would like to
> > > figure out how many students (id_r) have no suspensions (sus), i.e. have a
> > > code of '0'. My data is in long format and the first 20 records look like
> > > the following:
> > >
> > >> suslm[1:20,c(1,7)]
> > > id_r sus
> > > 11 0
> > > 15 10
> > > 16 0
> > > 18 0
> > > 19 0
> > > 19 0
> > > 20 0
> > > 21 0
> > > 21 0
> > > 22 0
> > > 24 0
> > > 24 0
> > > 25 3
> > > 26 0
> > > 26 0
> > > 30 0
> > > 30 0
> > > 31 0
> > > 32 0
> > > 33 0
> > >
> > > Each id_r is unique and I'd like to know the number of id_r that have a 0
> > > for sus not the total number of 0. Does that make sense?
> >
> > You say you have longitudinal data so may we assume that a particular
> > id_r can occur multiple times in the data set?
>
>
> Yes an id_r can occur multiple times in the data set.
>
>
> > It is not clear to me
> > what you want the result to be for students who have no suspensions at
> > one time but may have a suspension at another time. Are you
> > interested in the number of students who have only zeros in the sus
> > column?
> >
>
> Yes. Once a student has a value other than zero I don't want to include that
> student in the tally. So I want to know how many students never got
> suspended during the study.
>
>
> >
> > One way to approach this task is to use tapply. I would create a data
> > frame and convert id_r to a factor.
> >
> > df <- within(as.data.frame(suslm), id_r <- factor(id_r))
> > counts <- with(df, lapply(sus, id_r, function(sus) all(sus == 0)))
> >
>
>
> I am getting the following message:
>
> > df <- within(as.data.frame(suslm), id_r <- factor(id_r))
> > counts <- with(df, lapply(sus, id_r, function(sus) all(sus == 0)))
> Error in get(as.character(FUN), mode = "function", envir = envir) :
> object 'id_r' of mode 'function' was not found
>
>
> Thanks,
> Chris
>
>
> > The tapply function will split the vector sus according to the levels
> > of id_r and apply the function to the subvectors.
> >
> > I just saw Jorge's response and he uses the same tactic but he is
> > looking for students who had any value of sus==0
> >
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Received on Thu 07 Apr 2011 - 02:29:41 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 07 Apr 2011 - 04:30:27 GMT.
