[R] index question

From: Bob Green <bgreen_at_dyson.brisnet.org.au>
Date: Fri, 28 Dec 2007 22:24:28 +1000

I was hoping for some advice regarding indexing.

In a data frame there are 27 variables of interest, all with the prefix "pre":

  [7] "Decision"  "MHCDate"   "pre01"     "pre01111"  "pre012"    "pre013"
[13] "pre02"     "pre02111"  "pre02114"  "pre0211"   "pre0212"   "pre029"
[19] "pre03a"    "pre0311"   "pre0312"   "pre03"     "pre04"     "pre05"
[25] "pre06"     "pre07"     "pre08"     "pre09"     "pre10"     "pre11"
[31] "pre12"     "pre13"     "pre14"     "pre15"     "pre16"

I want to combine these variables into new variables, using the following criteria:

(1) create a single variable PRE, when any of the 27 'pre' variables has a value >= 1
(2) create a variable HOM, when any of the pre01, pre01111, pre012, pre013 variables has a value >= 1
(3) create a variable ASS, when any of the pre02, pre02111, pre02114, pre0211, pre0212, pre029 variables has a value >= 1
(4) create a variable SEX, when any of the pre03a, pre0311, pre0312, pre03 variables has a value >= 1
(5) create a variable VIO, when any of the pre01 to pre06 variables has a value >= 1
(6) create a variable SERASS: if pre02111 or pre02114 is >= 1, assign a value of 1; if pre0211 is >= 1, assign a value of 2; if pre0212 is >= 1, assign a value of 3; if pre029 is >= 1, assign a value of 4; everything else = 0. If a case has multiple values, pre02111 prevails over pre02114, pre02114 prevails over pre0211, pre0211 prevails over pre0212, and pre0212 prevails over pre029.
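A minimal sketch of one way criterion (6) could be coded, taking the column names from the names() listing (pre02114, pre029) and assuming the columns are numeric with no missing values. The toy data frame below is made up for illustration. The idea is to assign the codes from lowest to highest priority, so that a higher-priority code overwrites a lower one when a case qualifies for several.

```r
# Made-up example rows; column names follow the names() listing in the post.
reoffend <- data.frame(
  pre02111 = c(1, 0, 0, 0, 0, 0),
  pre02114 = c(0, 2, 0, 0, 0, 0),
  pre0211  = c(1, 0, 1, 0, 0, 0),
  pre0212  = c(0, 0, 1, 3, 0, 0),
  pre029   = c(0, 0, 0, 1, 1, 0)
)

# Start every case at 0, then assign from lowest to highest priority so that
# later (higher-priority) assignments overwrite earlier ones.
# Assumes the columns are numeric and contain no NAs.
SERASS <- rep(0, nrow(reoffend))
SERASS[reoffend$pre029   >= 1] <- 4
SERASS[reoffend$pre0212  >= 1] <- 3
SERASS[reoffend$pre0211  >= 1] <- 2
SERASS[reoffend$pre02111 >= 1 | reoffend$pre02114 >= 1] <- 1

reoffend$SERASS <- SERASS
```

Because pre02111 and pre02114 both map to the value 1, their relative priority does not affect the result in this sketch.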

I believe I can generate new variables (1) - (5) using code such as:

ASS <- (reoffend$pre02 | reoffend$pre02111 | reoffend$pre02114 | reoffend$pre0211 | reoffend$pre0212 | reoffend$pre029 >= '1')
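One caveat with the expression above: in R, `>=` binds more tightly than `|`, so only the last column is compared with the threshold, while the other columns are simply treated as TRUE when non-zero. Comparing against the string '1' also forces a character comparison. For non-negative integer counts the result may happen to coincide, but it is fragile. A minimal sketch of an alternative, assuming the pre columns are numeric; the toy data frame and the helper name `any_ge1` are hypothetical and not part of the original post:

```r
# Toy stand-in for the poster's data frame; values are made up.
reoffend <- data.frame(
  Decision = c(1, 2, 3, 1),
  pre01    = c(0, 2, 0, 0),
  pre02    = c(0, 0, 1, 0),
  pre02111 = c(0, 0, 0, 0),
  pre03    = c(0, 0, 0, 0)
)

# Hypothetical helper: TRUE for rows where any of the named columns is >= 1.
# The comparison is applied to every selected column, then counted per row.
any_ge1 <- function(df, cols) {
  unname(rowSums(df[, cols, drop = FALSE] >= 1, na.rm = TRUE) > 0)
}

# Pick up every column whose name starts with "pre" instead of typing them.
pre_cols <- grep("^pre", names(reoffend), value = TRUE)

reoffend$PRE <- any_ge1(reoffend, pre_cols)
reoffend$HOM <- any_ge1(reoffend, "pre01")
reoffend$ASS <- any_ge1(reoffend, c("pre02", "pre02111"))
```

Selecting columns by name pattern rather than by position (9:35) keeps the code valid if columns are later added or reordered.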

I have three questions:

  1. If this is correct, what is the most efficient way to generate (1) without having to type all the variable names? The following does not work: PRE <- reoffend [,9:35], >= '1'
  2. I am unsure how to generate variable (6).
  3. I wanted to exclude cases with a reoffend$Decision value of 3, using the code below. However, I received a message saying that NAs were produced, even though the raw variable has no NAs.

> MHT.decision <- reoffend[reoffend$Decision >= '2',]
> table(MHT.decision)

Error in vector("integer", length) : vector size cannot be NA
In addition: Warning messages:
1: NAs produced by integer overflow in: pd * (as.integer(cat) - 1L)
2: NAs produced by integer overflow in: pd * nl

> table(reoffend$Decision)

   1    2    3
1136  445   66

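A possible reading of the transcript above: calling table() on a whole data frame cross-tabulates every column against every other, so the number of cells can exceed the integer range, which would explain the overflow warnings even when Decision itself has no NAs. Note also that `>= '2'` keeps cases with Decision 2 and 3 rather than excluding 3. A minimal sketch under those assumptions, using made-up data in place of reoffend:

```r
# Toy stand-in for reoffend; values are made up.
reoffend <- data.frame(
  Decision = c(1, 2, 3, 1, 2, 3),
  pre01    = c(0, 1, 0, 2, 0, 1)
)

# Drop cases with Decision equal to 3, comparing against a number
# rather than a character string.
MHT.decision <- reoffend[reoffend$Decision != 3, ]

# Tabulate a single column instead of the whole data frame.
counts <- table(MHT.decision$Decision)
print(counts)
```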
Any assistance is much appreciated,

Bob Green



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 28 Dec 2007 - 12:22:36 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.