From: David James <djames_at_frontierassoc.com>

Date: Sat 27 Aug 2005 - 06:59:45 EST

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sat Aug 27 07:14:01 2005

What is the quickest way to create many categorical variables
(factors) from continuous variables?

This is the approach that I have used:

# create sample data

N <- 20

x <- runif(N,0,1)

# setup ranges to define categories

x.a <- (x >= 0.0) & (x < 0.4)
x.b <- (x >= 0.4) & (x < 0.5)
x.c <- (x >= 0.5) & (x < 0.6)
x.d <- (x >= 0.6) & (x < 1.0)

# create factors

i <- runif(N,1,1)   # min = max = 1, so this is simply a vector of N ones

x.new <- (i*1*x.a) + (i*2*x.b) + (i*3*x.c) + (i*4*x.d)
x.factor <- factor(x.new)

I'm looking for a better / simpler / more elegant / more robust (as the number of categories increases) way to do this. I also don't like that my factor names can only be numbers in this example. I would prefer a solution to take a form like the following (inspired by the "hist" function):

# define breakpoints

x.breaks = c(0, 0.4, 0.5, 0.6, 1.0)
x.factornames = c( "0 - 0.4", "0.4 - 0.5", "0.5 - 0.6", "0.6 - 1.0" )
x.factor = unknown.function( x, x.breaks, x.factornames )
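For what it is worth, base R's cut() appears to take almost exactly this form. The following is only a minimal sketch, assuming cut() is an acceptable stand-in for the hypothetical unknown.function() above. The set.seed() call and the choice of right = FALSE with include.lowest = TRUE are assumptions made here so that the intervals match the (x >= a) & (x < b) tests used earlier. One difference from those tests: with these settings a value of exactly 1.0 falls into the last category instead of being left out.

```r
# Sketch only: cut() is assumed here as a stand-in for unknown.function()
set.seed(1)   # fixed seed, added only so the example is reproducible
N <- 20
x <- runif(N, 0, 1)

# breakpoints and labels as defined in the post
x.breaks <- c(0, 0.4, 0.5, 0.6, 1.0)
x.factornames <- c("0 - 0.4", "0.4 - 0.5", "0.5 - 0.6", "0.6 - 1.0")

# right = FALSE makes each interval closed on the left, [a, b);
# include.lowest = TRUE then also places x == 1.0 in the last interval
x.factor <- cut(x, breaks = x.breaks, labels = x.factornames,
                right = FALSE, include.lowest = TRUE)

# counts per category
table(x.factor)
```

If the labels argument is omitted, cut() builds level names from the breakpoints itself, which may be enough when the number of categories grows.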

Thanks,

David

P.S. Here's what I have read to try to find the answer to my problem:

* "Introductory Statistics with R"
* "A Brief Guide to R for Beginners in Econometrics"
* "Econometrics in R"

______________________________________________
R-help@stat.math.ethz.ch mailing list

