Re: [R] Count of rows while looping through data

From: jim holtman <jholtman_at_gmail.com>
Date: Fri, 27 May 2011 15:45:52 -0400

When you subset a data frame, its factor columns keep all of their original levels, including levels that no longer occur in the subset.  You can drop the unused levels by re-creating the factor:

x$fac <- factor(x$fac)

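> # stringsAsFactors=TRUE was the default before R 4.0.0; it is spelled out here
> x <- data.frame(fam=c('a','a','b'), grp=c('1','2','3'), stringsAsFactors=TRUE)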
> # split
> x.s <- split(x, x$fam)
> # notice additional levels
> str(x.s$b)

'data.frame': 1 obs. of 2 variables:
 $ fam: Factor w/ 2 levels "a","b": 2
 $ grp: Factor w/ 3 levels "1","2","3": 3
>
> z <- x.s$b
> str(z)

'data.frame': 1 obs. of 2 variables:
 $ fam: Factor w/ 2 levels "a","b": 2
 $ grp: Factor w/ 3 levels "1","2","3": 3
> z$grp <- factor(z$grp) # remove extra levels
> str(z)

'data.frame': 1 obs. of 2 variables:
 $ fam: Factor w/ 2 levels "a","b": 2
 $ grp: Factor w/ 1 level "3": 1
>
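
If several factor columns need cleaning at once, base R's droplevels() can be applied to the whole data frame instead of re-creating each factor by hand.  A minimal sketch using the same toy data; the names z and z2 are only illustrative, and stringsAsFactors = TRUE is written out because R 4.0.0 and later no longer convert strings to factors by default:

```r
# Same toy data as in the example; stringsAsFactors = TRUE is explicit because
# R >= 4.0.0 no longer converts strings to factors by default.
x <- data.frame(fam = c('a', 'a', 'b'), grp = c('1', '2', '3'),
                stringsAsFactors = TRUE)

z  <- split(x, x$fam)$b   # one row, but both factors keep all original levels
z2 <- droplevels(z)       # drops unused levels from every factor column

nlevels(z$grp)            # number of levels before cleaning
nlevels(z2$grp)           # number of levels after cleaning
levels(z2$fam)            # only the level that actually occurs
```

This gives the same result as calling factor() on each column in turn, just in one step.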

On Fri, May 27, 2011 at 12:20 PM, Jeanna <stroutj_at_uw.edu> wrote:
> I may have gotten excited prematurely...
>
> I ended up using the split method since my family indicators are
> alphanumeric so my issue is as follows.
>
> I'm applying this to different subsets of my main data set.  The subsets do
> not contain all families.  When I run the method on one of my subsets I get
> back a table that includes ALL the families.  Those that weren't in the
> subset to which I applied the method have <NA> for all of the fields.
>
> If I export one of the subsets, restart R (to be certain nothing of my
> original playtime is left) and import only the subset, the method works
> perfectly.
>
> The addition of the previously removed rows seems to happen at the 'split'
> step.
>
> Is there something I'm doing incorrectly?  I can't seem to figure out how to
> convince R not to look at my original data frame when deciding how many
> families there are.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Count-of-rows-while-looping-through-data-tp3547949p3555752.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
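
The symptom described above (families reappearing at the 'split' step) can probably be reproduced and avoided as in the sketch below.  The data frame dat, its columns, and the family ids are hypothetical stand-ins, not the poster's actual data; the only technique shown is the drop argument of split(), which discards factor levels that do not occur:

```r
# Hypothetical stand-in for the poster's data: three families with
# alphanumeric ids, stored as a factor.
dat <- data.frame(fam = c('A1', 'A1', 'B2', 'C3'), val = 1:4,
                  stringsAsFactors = TRUE)
sub <- dat[dat$fam != 'B2', ]      # this subset no longer contains family B2

with_all  <- split(sub, sub$fam)               # B2 still appears, with zero rows
with_used <- split(sub, sub$fam, drop = TRUE)  # only families present in sub

names(with_all)
names(with_used)

# Row counts per family, without the phantom group
sapply(with_used, nrow)
```

Calling droplevels(sub) or sub$fam <- factor(sub$fam) before the split should have the same effect, and it also keeps the unused levels out of any later steps.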

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

Received on Fri 27 May 2011 - 19:48:40 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 27 May 2011 - 19:50:11 GMT.