# Re: [R] Need a more efficient way to implement this type of logic in R

From: Joshua Wiley <jwiley.psych_at_gmail.com>
Date: Wed, 06 Apr 2011 13:49:40 -0700

Hi Walter,

Take a look at ?cut. The function is designed to take a continuous variable and categorize it, and it will be much simpler and faster than looping over rows. The only qualification is that your data need to be numeric, not character. If your only values are the ones you put in quotes in your code ('02' etc.), a simple call to as.numeric(variablename) ought to do the trick (if the column is stored as a factor rather than character, use as.numeric(as.character(variablename)) instead, since as.numeric() on a factor returns the internal level codes). Beyond being faster, you can probably get down to one line of code, which should be much easier on the eyes. Note that cut() returns a factor. To see some examples with cut(), type (at the console):

example(cut)
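
For the recoding in your post, a minimal sketch might look like the following. The toy `households` data frame is made up for illustration, and the breaks assume HHFAMINC only takes the codes that appear in your loop ('01' to '18' plus '-7', '-8', '-9'), so please check them against your real data before relying on this:

```r
# Toy stand-in for the real 'households' data (values invented for illustration)
households <- data.frame(
  HOUSEID  = 1:8,
  HHFAMINC = c("01", "05", "06", "10", "15", "17", "18", "-7"),
  stringsAsFactors = FALSE
)

hh.sub <- households[c("HOUSEID", "HHFAMINC")]

# Convert the character codes to numbers; as.character() guards against factors
inc <- as.numeric(as.character(hh.sub$HHFAMINC))

# Intervals are right-closed by default:
# (-Inf,0] -> 0, (0,5] -> 1, (5,10] -> 2, (10,15] -> 3, (15,17] -> 4, (17,18] -> 5
hh.sub$CS_FAMINC <- cut(inc,
                        breaks = c(-Inf, 0, 5, 10, 15, 17, 18),
                        labels = 0:5)

# cut() returns a factor; convert back if plain numbers are wanted
hh.sub$CS_FAMINC <- as.numeric(as.character(hh.sub$CS_FAMINC))
```

One difference from the loop: here any code at or below zero lands in group 0, and any code above 18 becomes NA, so it is worth running table(hh.sub$HHFAMINC) first to see which values actually occur.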

Hope this helps,

Josh

P.S. If you are planning on doing any modelling with this data, why not leave it continuous?

On Wed, Apr 6, 2011 at 1:02 PM, Walter Anderson <wandrson01_at_gmail.com> wrote:
> I have cobbled together the following logic. It works but is very slow.
> I'm sure that there must be a better R-specific way to implement this kind
> of thing, but have been unable to find/understand one. Any help would be
> appreciated.
>
> hh.sub <- households[c("HOUSEID","HHFAMINC")]
> for (indx in 1:length(hh.sub$HOUSEID)) {
>  if ((hh.sub$HHFAMINC[indx] == '01') | (hh.sub$HHFAMINC[indx] == '02') |
> (hh.sub$HHFAMINC[indx] == '03') | (hh.sub$HHFAMINC[indx] == '04') |
> (hh.sub$HHFAMINC[indx] == '05'))
>    hh.sub$CS_FAMINC[indx] <- 1 # Less than $25,000
>  if ((hh.sub$HHFAMINC[indx] == '06') | (hh.sub$HHFAMINC[indx] == '07') |
> (hh.sub$HHFAMINC[indx] == '08') | (hh.sub$HHFAMINC[indx] == '09') |
> (hh.sub$HHFAMINC[indx] == '10'))
>    hh.sub$CS_FAMINC[indx] <- 2 # $25,000 to $50,000
>  if ((hh.sub$HHFAMINC[indx] == '11') | (hh.sub$HHFAMINC[indx] == '12') |
> (hh.sub$HHFAMINC[indx] == '13') | (hh.sub$HHFAMINC[indx] == '14') |
> (hh.sub$HHFAMINC[indx] == '15'))
>    hh.sub$CS_FAMINC[indx] <- 3 # $50,000 to $75,000
>  if ((hh.sub$HHFAMINC[indx] == '16') | (hh.sub$HHFAMINC[indx] == '17'))
>    hh.sub$CS_FAMINC[indx] <- 4 # $75,000 to $100,000
>  if ((hh.sub$HHFAMINC[indx] == '18'))
>    hh.sub$CS_FAMINC[indx] <- 5 # More than $100,000
>  if ((hh.sub$HHFAMINC[indx] == '-7') | (hh.sub$HHFAMINC[indx] == '-8') |
> (hh.sub$HHFAMINC[indx] == '-9'))
>    hh.sub$CS_FAMINC[indx] = 0
> }
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> and provide commented, minimal, self-contained, reproducible code.
>

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
