Re: [R] randomForest outlier

From: Birgit Lemcke <birgit.lemcke_at_systbot.uzh.ch>
Date: Wed, 16 Jul 2008 15:26:11 +0200

I use a different dissimilarity measure (library(analogue); Gower's index). I just wanted to see whether both "tables" contain similar values.

I am mainly trying to find the best model to explain my predefined groups (using a bunch of different variables: factors, counts, numeric, ordered factors).
I am also fiddling around with a logistic regression.

B.

On 16.07.2008 at 14:58, Liaw, Andy wrote:

> Note that I did say "by this measure": what you may want to
> consider as an outlier may not be what this measure picks out.
> After all, RF proximities are a bit unusual as a similarity measure.
>
>> -----Original Message-----
>> From: Birgit Lemcke [mailto:birgit.lemcke_at_systbot.uzh.ch]
>> Sent: Wednesday, July 16, 2008 8:55 AM
>> To: Liaw, Andy
>> Cc: R Hilfe
>> Subject: Re: [R] randomForest outlier
>>
>> Thanks anyway for your answer.
>> That was also an option that I took into account (no potential
>> outliers) and I will have a look at the "value" section of ?outlier.
>>
>> B.
>>
>> On 16.07.2008 at 14:11, Liaw, Andy wrote:
>>
>>> Perhaps if you follow the posting guide more closely, you might get
>>> more (useful) replies, but without looking at your data, I doubt
>>> there's much anyone can do for you.
>>>
>>> The fact that the range of the outlying measures is -1 to 2 would tell
>>> me there are no potential outliers by this measure. Please see the
>>> "value" section of ?outlier to see how this measure is computed.
>>>
>>> Andy
>>>
>>> From: Birgitle
>>>>
>>>> Still the same question:
>>>>
>>>>
>>>> Birgitle wrote:
>>>>>
>>>>> I am trying to use ?randomForest to find the variables that are most
>>>>> important for dividing my dataset (continuous and categorical
>>>>> variables) into two given groups.
>>>>>
>>>>> But when I plot the outlier:
>>>>>
>>>>> plot(outlier(rfObject, cls=groupingVariable),
>>>>> type="p",col=c("red","green")[as.numeric(groupingVariable)])
>>>>>
>>>>> it seems to me that all my values appear as outliers.
>>>>> Does anybody have suggestions as to what is going wrong in my analysis?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> Additional remark:
>>>> The range of the y-axis is quite small, between -1 and 2.
>>>>
>>>>
>>>> -----
>>>> The art of living is more like wrestling than dancing.
>>>> (Marcus Aurelius)
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/randomForest-outlier-tp17979182p18466832.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
> Notice: This e-mail message, together with any attachments, contains
> information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station,
> New Jersey, USA 08889), and/or its affiliates (which may be known
> outside the United States as Merck Frosst, Merck Sharp & Dohme or
> MSD and in Japan, as Banyu - direct contact information for
> affiliates is
> available at http://www.merck.com/contact/contacts.html) that may be
> confidential, proprietary copyrighted and/or legally privileged. It is
> intended solely for the use of the individual or entity named on this
> message. If you are not the intended recipient, and have received this
> message in error, please notify us immediately by reply e-mail and
> then delete it from your system.
>
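
For readers following along: the measure Andy refers to can be sketched in a few lines of base R. This is only an illustration of the computation as described in the "value" section of ?outlier (n divided by the sum of squared within-class proximities, then centred by the median and scaled by the MAD within each class), not the package's own code. The function name outlying_measure and the proximity matrix below are made up for the example; with a real forest grown with proximity = TRUE, the matrix would come from the fitted object instead.

```r
# Sketch of the within-class outlying measure described in ?outlier.
# 'outlying_measure', 'prox' and 'cls' are hypothetical names and toy data,
# not taken from the thread or from the randomForest package.
outlying_measure <- function(prox, cls) {
  stopifnot(is.matrix(prox), nrow(prox) == ncol(prox),
            length(cls) == nrow(prox))
  n <- nrow(prox)
  cls <- factor(cls)
  out <- rep(NA_real_, n)
  for (lv in levels(cls)) {
    idx <- cls == lv
    # raw score: n over the sum of squared proximities to same-class cases
    ss <- rowSums(prox[idx, idx, drop = FALSE]^2)
    raw <- n / ifelse(ss == 0, 1, ss)
    # robust standardisation within the class: centre by median, scale by MAD
    out[idx] <- (raw - median(raw)) / mad(raw)
  }
  out
}

# Toy proximity matrix: two classes of five cases each.
n <- 10
cls <- rep(c("A", "B"), each = 5)
prox <- matrix(0.1, n, n)            # cross-class values, ignored by the measure
for (i in 1:n) for (j in 1:n) {
  if (cls[i] == cls[j]) prox[i, j] <- 0.6 + 0.03 * ((i + j) %% 5)
}
prox[1, ] <- prox[, 1] <- 0.05       # case 1 is far from every other case
diag(prox) <- 1

om <- outlying_measure(prox, cls)
print(round(om, 2))
```

In this toy setup only case 1 gets a large value; the cases in class B, which has no planted outlier, stay within a narrow band around zero, comparable to the -1 to 2 range reported above. Because the measure is standardised within each class, roughly half of the cases in a class will always lie above zero, so points scattered around zero on the plot do not by themselves indicate outliers.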



Birgit Lemcke
Institut of Systematic Botany
University of Zurich
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
mail: birgit.lemcke_at_systbot.uzh.ch


R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 16 Jul 2008 - 13:29:04 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 16 Jul 2008 - 13:31:53 GMT.
