Re: [R] aggregate() function, strange behavior for augmented data

From: David Afshartous <dafshartous_at_med.miami.edu>
Date: Mon, 16 Jun 2008 11:50:05 -0400

Both versions of the data were read in the same way, and str(junk1) confirms that they have the same structure. This is very strange.

## original data:
> str(junk1)
'data.frame': 96 obs. of 3 variables:
 $ Hour: int  0 3 5 0 3 5 0 3 5 0 ...
 $ Drug: Factor w/ 2 levels "D","P": 2 2 2 1 1 1 2 2 2 1 ...
 $ Aldo: int  9 15 4 8 13 3 5 11 5 7 ...

## augmented data:
> str(junk1)
'data.frame': 108 obs. of 3 variables:
 $ Hour: int  0 3 5 0 3 5 0 3 5 0 ...
 $ Drug: Factor w/ 2 levels "D","P": 2 2 2 1 1 1 2 2 2 1 ...
 $ Aldo: int  9 15 4 8 13 3 5 11 5 7 ...
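
In case it helps anyone reading along, here is a minimal sketch of the recoding workaround. It does not explain why aggregate() changes the type. The data frame `res` and its Aldo values are hypothetical stand-ins for the aggregate() result; only the Hour levels 0, 3, 5 come from the output quoted below.

```r
# Hypothetical stand-in for the aggregate() result: Hour is stored as a
# factor whose level labels are the strings "0", "3", "5".
res <- data.frame(
  Hour = factor(c(0, 3, 5, 0, 3, 5)),
  Drug = factor(rep(c("D", "P"), each = 3)),
  Aldo = c(8, 13, 3, 9, 15, 4)  # placeholder values, for illustration only
)

# Check which columns came back as factors.
sapply(res, class)

# as.numeric() applied directly to a factor returns the internal level
# codes (1, 2, 3), so convert via the level labels instead.
if (is.factor(res$Hour)) {
  res$Hour <- as.numeric(levels(res$Hour))[res$Hour]
}

res$Hour
```

The guard with is.factor() means the same code can be run on both the original and the augmented results without touching a column that is already numeric.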

On 6/16/08 11:37 AM, "markleeds_at_verizon.net" <markleeds_at_verizon.net> wrote:

> 
> hi: do str(junk1) and it will tell you what the components of junk1 are.
>
> the only thing i can think of is that you used stringsAsFactors=FALSE
> when you (probably) used read.table to read in junk, but you didn't use
> that option when you used read.table to read in junk1?
> 
> 
> On Mon, Jun 16, 2008 at 11:30 AM, David Afshartous wrote:
>
>> All,
>>
>> I'm re-running an analysis that has been augmented with additional data.
>> When I use the exact same code on the augmented data, the behavior of the
>> aggregate function is very strange, viz., one of the resulting variables
>> is now coded as a factor, whereas it was coded as numeric for the original
>> data. Unfortunately, I cannot provide a reproducible code example since it
>> only seems to occur with this data. I've checked and re-checked both the
>> original and augmented data, but nothing appears inconsistent. Any
>> suggestions much appreciated. See below for specifics.
>>
>> Cheers,
>> David
>>
>> # original data
>> > dim(junk1)
>> [1] 96  3
>> > junk1[1,]
>>   Hour Drug Aldo
>> 1    0    P    9
>> > junk1$Hour
>>  [1] 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3
>> [39] 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0
>> [77] 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5   ### not coded as a factor
>> > junk1.mean.time.drug = aggregate(junk1[3], junk1[c(1,2)], mean)
>> > junk1.mean.time.drug$Hour
>> [1] 0 3 5 0 3 5   ### not coded as a factor
>>
>> # augmented data
>> > dim(junk1)
>> [1] 108   3
>> > junk1[1,]
>>   Hour Drug Aldo
>> 1    0    P    9
>> > junk1$Hour
>>   [1] 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3
>>  [51] 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0
>> [101] 3 5 0 3 5 0 3 5   ### not coded as a factor
>> > junk1.mean.time.drug = aggregate(junk1[3], junk1[c(1,2)], mean)
>> > junk1.mean.time.drug$Hour
>> [1] 0 3 5 0 3 5
>> Levels: 0 3 5   ################## coded as a factor now!
>>
>> ## of course, I can recode it again, but I'm curious as to why this is
>> ## changing here
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.


Received on Mon 16 Jun 2008 - 17:09:54 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 16 Jun 2008 - 17:30:43 GMT.
