Re: [R] row-wise conditional update in dataframe

From: Jon Erik Ween <jween_at_klaru-baycrest.on.ca>
Date: Mon, 21 Jan 2008 21:10:58 -0500

Thanks Jim

That got me there. I suppose R prefers absolute field references in scripts, rather than macro substitution of field names as you would do in Perl or shell scripts?
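For readers of the archive: R can also address fields indirectly, through names held in a variable, which is probably the closest analogue to substituting field names in Perl or shell. The following is only a minimal sketch under stated assumptions: the data frame `Dataset`, its column names, and its values are made up for illustration, and `vars` is assumed to be a character vector of field names as described in the original question.

```r
# Hypothetical data; the column names and values are illustrative only
Dataset <- data.frame(id      = 1:4,
                      field_a = c(1, NA, 3, NA),
                      field_b = c(NA, NA, 2, 5),
                      field_c = c(7, 8, NA, NA))

# Field names held in a character vector, as in the original question
vars <- c("field_a", "field_b", "field_c")

# Index the columns by name and count the NAs in each row
na_count <- rowSums(is.na(Dataset[, vars]))

# The target column can also be named through a variable;
# here it receives the per-row count of non-missing fields
target <- "ABCtaskNum"
Dataset[[target]] <- unname(rowSums(!is.na(Dataset[, vars])))
```

Indexing with `Dataset[, vars]` and `Dataset[[target]]` keeps the field names as ordinary data, so no text substitution into the script is needed.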

Anyway, thanks for your help.

Cheers

Jon

Soli Deo Gloria

Jon Erik Ween, MD, MS
Scientist, Kunin-Lunenfeld Applied Research Unit
Director, Stroke Clinic, Brain Health Clinic
Baycrest Centre for Geriatric Care
Assistant Professor, Dept. of Medicine, Div. of Neurology
University of Toronto Faculty of Medicine

Posluns Building, 6th Floor, Room 644
Baycrest Centre for Geriatric Care
3560 Bathurst Street
Toronto, Ontario M6A 2E1
Canada

Phone: 416-785-2500 x3636
Fax: 416-785-2484
Email: jween_at_klaru-baycrest.on.ca

Confidential: This communication and any attachment(s) may contain confidential or privileged information and is intended solely for the address(es) or the entity representing the recipient(s). If you have received this information in error, you are hereby advised to destroy the document and any attachment(s), make no copies of same and inform the sender immediately of the error. Any unauthorized use or disclosure of this information is strictly prohibited.

On 21-Jan-08, at 8:57 PM, jim holtman wrote:

> If you only want a subset, then use that in the function:
>
> Dataset.target <- apply(x,1,function(.row) sum(is.na(.row[3:8])))
>
> This will put it back in column 1:
>
>> x <- matrix(1,10,10)
>> x[sample(1:100,10)] <- NA
>> x[,1] <- 0 # make sure column 1 has no NAs so sums are correct
>> x
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    0    1   NA    1    1    1    1   NA    1     1
>  [2,]    0    1    1    1    1    1    1    1    1     1
>  [3,]    0    1    1    1    1    1    1    1    1    NA
>  [4,]    0    1    1    1    1    1    1   NA    1     1
>  [5,]    0    1    1   NA    1    1    1    1    1     1
>  [6,]    0    1    1    1    1    1    1    1    1     1
>  [7,]    0    1    1    1    1    1    1    1    1     1
>  [8,]    0   NA    1   NA   NA    1   NA    1    1    NA
>  [9,]    0    1    1    1    1    1    1    1    1     1
> [10,]    0    1    1    1    1    1    1    1    1     1
>> # get the sums of NA in 3:8 and put in column 1
>> x[,1] <- apply(x, 1, function(.row) sum(is.na(.row[3:8])))
>>
>>
>> x
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    2    1   NA    1    1    1    1   NA    1     1
>  [2,]    0    1    1    1    1    1    1    1    1     1
>  [3,]    0    1    1    1    1    1    1    1    1    NA
>  [4,]    1    1    1    1    1    1    1   NA    1     1
>  [5,]    1    1    1   NA    1    1    1    1    1     1
>  [6,]    0    1    1    1    1    1    1    1    1     1
>  [7,]    0    1    1    1    1    1    1    1    1     1
>  [8,]    3   NA    1   NA   NA    1   NA    1    1    NA
>  [9,]    0    1    1    1    1    1    1    1    1     1
> [10,]    0    1    1    1    1    1    1    1    1     1
>>
>
>
> On Jan 21, 2008 8:47 PM, Jon Erik Ween <jween_at_klaru-baycrest.on.ca>
> wrote:
>> Thanks Jim
>>
>> I see how this works. Problem is, I need to interrogate only a subset
>> of fields. In your example, I need to put the total number of "NA"
>> fields out of fields 3..8, excluding 1, 2, 9 and 10. Also, I don't see how
>> the method inserts the sum into a particular field in a row. I guess
>> you could do
>>
>> Dataset.target <- apply(x,1,function(.row) sum(is.na(.row)))
>>
>> Thanks
>>
>> Jon
>>
>>
>> On 21-Jan-08, at 8:28 PM, jim holtman wrote:
>>
>>> You need to use 'is.na(x)' instead of 'x == NA'. Here is a way of
>>> doing it:
>>>
>>>> x <- matrix(1,10,10)
>>>> x[sample(1:100,10)] <- NA
>>>> x
>>>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>>>  [1,]    1    1    1    1    1    1    1    1    1     1
>>>  [2,]    1    1    1    1    1    1   NA    1    1     1
>>>  [3,]    1    1    1    1    1    1    1    1    1     1
>>>  [4,]    1    1    1    1    1    1    1    1    1     1
>>>  [5,]    1    1    1    1    1    1    1    1    1     1
>>>  [6,]   NA    1    1    1    1    1    1    1   NA     1
>>>  [7,]    1    1   NA   NA    1   NA    1    1    1    NA
>>>  [8,]    1    1    1    1    1   NA    1    1    1     1
>>>  [9,]    1    1    1    1    1    1    1    1   NA     1
>>> [10,]    1   NA    1    1    1    1    1    1    1     1
>>>>
>>>> apply(x,1,function(.row) sum(is.na(.row)))
>>> [1] 0 1 0 0 0 2 4 1 1 1
>>>>
>>>
>>>
>>> On Jan 21, 2008 7:23 PM, Jon Erik Ween <jween_at_klaru-baycrest.on.ca>
>>> wrote:
>>>> Hi!
>>>>
>>>> I need to conditionally update a dataframe field based on values in
>>>> other fields and can't even find how to search for this properly.
>>>> Sorry if this has been asked before.
>>>>
>>>> Specifically, I have a 490 x 221 dataframe and need to count, by
>>>> row, how many fields in Dataframe$field_a...Dataframe$field_zz are
>>>> non-null and enter this value in Dataset$ABCtaskNum. I have field
>>>> name definitions in a vector "vars" and tried writing a custom
>>>> function to handle the within-row calculation
>>>>
>>>> myfunct <- function () {for (i in 1:length(vars)) {if (vars[i] != NA) {Dataset$ABCtaskNum <- Dataset$ABCtaskNum + 1}}}
>>>>
>>>> and then use "apply" to handle the row to row calculation
>>>>
>>>> Dataset <- apply(Dataset, 1, myfunc)
>>>>
>>>> where Dataset already has field Dataset$ABCtaskNum set to 0 in all rows.
>>>>
>>>> But that didn't work. Doesn't help if I declare variables (vars and
>>>> ABCtaskNum) in the function declaration either, but then I haven't
>>>> quite figured out how best to do variable substitutions in R.
>>>>
>>>> Thanks for any help. Cheers
>>>>
>>>> Jon
>>>>
>>>> ______________________________________________
>>>> R-help_at_r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Cincinnati, OH
>>> +1 513 646 9390
>>>
>>> What is the problem you are trying to solve?
>>>
>>
>
>
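
The point made in the first reply of the thread, that `is.na(x)` is needed rather than `x == NA`, can be illustrated with a small sketch. The vector `v` and the other variable names below are made up for illustration and are not part of the original exchange.

```r
v <- c(1, NA, 3)          # hypothetical values with one missing entry

cmp  <- v == NA           # comparing with NA gives NA for every element
miss <- is.na(v)          # logical vector marking the missing entries
n_present <- sum(!miss)   # count of non-missing entries

# An NA condition inside if() is an error, which is why a loop that
# tests `something != NA` cannot work as a missing-value check
res <- tryCatch(if (v[2] != NA) "present" else "missing",
                error = function(e) "error")
```

Because `==` and `!=` propagate NA instead of returning TRUE or FALSE, counting through `is.na()` is the safer route for this kind of row-wise tally.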



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 22 Jan 2008 - 02:14:18 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 22 Jan 2008 - 08:30:08 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.
