Re: [R] Getting number of students with zeroes in long format

From: Christopher Desjardins <cddesjardins_at_gmail.com>
Date: Thu, 07 Apr 2011 08:07:11 -0500

Hi Jorge,
I want to make sure this does what I want.

So I want to get a count of students that never get a suspension. Once a student has a non-zero value, I don't want to count that student. Each id_r may be associated with multiple sus values. Are these commands doing this? Because ...

> suslm[175953:nrow(suslm),c("id_r","sus")]

           id_r sus

999881.5 999881   1
999881.6 999881   7
999881.7 999881   0
999881.8 999881   0
999886.5 999886   0
999886.6 999886   0
999886.7 999886   0
999886.8 999886   0
999890.5 999890   0
999890.6 999890   0
999890.7 999890   0
999890.8 999890   0
999892.5 999892   0
999892.6 999892   0
999892.7 999892   0
999892.8 999892   0
999896.5 999896   0
999896.6 999896   4
999896.7 999896   3
999896.8 999896   0
999897.5 999897   0
999897.6 999897   0
999897.7 999897   0

>
> tail(with(suslm,tapply(sus,id_r,function(x) any(x==0))))
999881 999886 999890 999892 999896 999897
  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
> r <- with(suslm, tapply(sus, id_r, function(x) any(x > 0)))
> tail(with(suslm, tapply(sus, id_r, function(x) any(x > 0))))
999881 999886 999890 999892 999896 999897
  TRUE  FALSE  FALSE  FALSE   TRUE  FALSE

Based on this, 999881 and 999896 should be FALSE, not TRUE.

I would expect that if they were TRUE for the first command, they should be FALSE for the second command, right?
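A minimal sketch of why both commands can return TRUE for the same student, using three of the ids printed above. The data frame name `toy` and the result names are made up for illustration, and the sketch assumes there are no NA values in sus. The test `any(x == 0)` asks whether a student has at least one zero, so it is not the opposite of `any(x > 0)`; the opposite is `all(x == 0)`.

```r
# Toy data copied from the rows printed above (three ids, four records each).
toy <- data.frame(
  id_r = rep(c(999881, 999886, 999896), each = 4),
  sus  = c(1, 7, 0, 0,
           0, 0, 0, 0,
           0, 4, 3, 0)
)

# At least one record with zero suspensions.
has_zero  <- with(toy, tapply(sus, id_r, function(x) any(x == 0)))

# At least one record with a suspension.
ever_sus  <- with(toy, tapply(sus, id_r, function(x) any(x > 0)))

# Every record is zero: the logical complement of ever_sus.
never_sus <- with(toy, tapply(sus, id_r, function(x) all(x == 0)))

print(has_zero)
print(ever_sus)
print(never_sus)
```

Students 999881 and 999896 have both zero and non-zero records, so they satisfy `any(x == 0)` and `any(x > 0)` at the same time.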

> tail(names(r[ r == TRUE ]))

[1] "999752" "999767" "999806" "999807" "999881" "999896"
> tail(names(r[ r == FALSE ]))

[1] "999869" "999870" "999886" "999890" "999892" "999897"

This command seems to do the right thing. Is that right?
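For the count itself, one possible approach is to negate r and sum it. The sketch below rebuilds the 20 records quoted further down in this thread as a data frame called `d` (the name used in Jorge's reply); it assumes sus has no NA values and is not the only way to do this.

```r
# The first 20 records from the original post, typed in by hand.
d <- data.frame(
  id_r = c(11, 15, 16, 18, 19, 19, 20, 21, 21, 22,
           24, 24, 25, 26, 26, 30, 30, 31, 32, 33),
  sus  = c(0, 10, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 3, 0, 0, 0, 0, 0, 0, 0)
)

# TRUE for each student with at least one non-zero sus.
r <- with(d, tapply(sus, id_r, function(x) any(x > 0)))

n_ever  <- sum(r)    # students with at least one suspension
n_never <- sum(!r)   # students whose records are all zero

print(n_never)
print(names(r)[r])   # ids of students with a suspension
```

Here `sum(!r)` counts students, not rows, because tapply has already collapsed the records to one value per id_r.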

On Wed, Apr 6, 2011 at 10:25 PM, Jorge Ivan Velez <jorgeivanvelez_at_gmail.com> wrote:

> Hi Chris,
>
> Sorry I did not see your email before ;-) Here is one option:
>
> > r <- with(d, tapply(sus, id_r, function(x) any(x > 0)))
> > r
>    11    15    16    18    19    20    21    22    24    25    26    30    31    32
> FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
>    33
> FALSE
> > names(r[ r == TRUE ])
> [1] "15" "25"
>
> Regards,
> Jorge
>
>
> On Wed, Apr 6, 2011 at 5:03 PM, Christopher Desjardins <> wrote:
>
>> Thanks. And how many could I find that have greater than 0?
>> Chris
>>
>>
>> On Wed, Apr 6, 2011 at 3:58 PM, Jorge Ivan Velez <> wrote:
>>
>>> Hi Chris,
>>>
>>> Is this what you have in mind?
>>>
>>> > sum(with(yourdata, tapply(sus, id_r, function(x) any(x==0))))
>>> [1] 13
>>>
>>> HTH,
>>> Jorge
>>>
>>>
>>> On Wed, Apr 6, 2011 at 4:44 PM, Christopher Desjardins <> wrote:
>>>
>>>> Hi,
>>>> I have longitudinal school suspension data on students. I would like to
>>>> figure out how many students (id_r) have no suspensions (sus), i.e. have a
>>>> code of '0'. My data is in long format and the first 20 records look like
>>>> the following:
>>>>
>>>> > suslm[1:20,c(1,7)]
>>>> id_r sus
>>>> 11 0
>>>> 15 10
>>>> 16 0
>>>> 18 0
>>>> 19 0
>>>> 19 0
>>>> 20 0
>>>> 21 0
>>>> 21 0
>>>> 22 0
>>>> 24 0
>>>> 24 0
>>>> 25 3
>>>> 26 0
>>>> 26 0
>>>> 30 0
>>>> 30 0
>>>> 31 0
>>>> 32 0
>>>> 33 0
>>>>
>>>> Each id_r is unique and I'd like to know the number of id_r that have a 0
>>>> for sus, not the total number of 0. Does that make sense?
>>>> Thanks!
>>>> Chris
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help_at_r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>
>




Received on Thu 07 Apr 2011 - 13:09:10 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 08 Apr 2011 - 15:15:29 GMT.
