From: Douglas Bates

Date: Wed, 06 Apr 2011 16:03:24 -0500

On Wed, Apr 6, 2011 at 3:44 PM, Christopher Desjardins
<cddesjardins_at_gmail.com> wrote:

Hi,

I have longitudinal school suspension data on students. I would like to
figure out how many students (id_r) have no suspensions (sus), i.e. have a
code of '0'. My data is in long format and the first 20 records look like
the following:
> suslm[1:20,c(1,7)]
id_r sus
11 0
15 10
16 0
18 0
19 0
19 0
20 0
21 0
21 0
22 0
24 0
24 0
25 3
26 0
26 0
30 0
30 0
31 0
32 0
33 0

Each id_r is unique and I'd like to know the number of id_r that have a 0
for sus not the total number of 0. Does that make sense?

You say you have longitudinal data, so may we assume that a particular id_r can occur multiple times in the data set? It is not clear to me what you want the result to be for students who have no suspensions at one time but may have a suspension at another time. Are you interested in the number of students who have only zeros in the sus column?

One way to approach this task is to use tapply. I would create a data frame and convert id_r to a factor.

df <- within(as.data.frame(suslm), id_r <- factor(id_r))
counts <- with(df, tapply(sus, id_r, function(sus) all(sus == 0)))

The tapply function will split the vector sus according to the levels of id_r and apply the function to the subvectors.
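As a minimal sketch of the idea, assuming the goal is the number of students whose sus values are all zero, the per-student logicals can be summed to get that count. The data frame toy and its values are made up for illustration and merely stand in for columns 1 and 7 of suslm; the names never_suspended and n_never are also just illustrative.

```r
# Toy data standing in for the id_r and sus columns (values are invented).
toy <- data.frame(
  id_r = c(11, 15, 19, 19, 21, 21, 25, 25),
  sus  = c(0, 10, 0, 0, 0, 2, 3, 0)
)
toy <- within(toy, id_r <- factor(id_r))

# One logical per student: TRUE when every record for that id_r has sus == 0.
never_suspended <- with(toy, tapply(sus, id_r, function(s) all(s == 0)))

# Summing the logicals counts the students with no suspensions at all.
n_never <- sum(never_suspended)
```

Here students 21 and 25 each have one zero record and one non-zero record, so they are not counted. If instead you wanted students with at least one zero record, any(s == 0) could replace all(s == 0).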

