Re: [R] Discriminant function analysis

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu, 7 Feb 2008 16:24:11 +0000 (GMT)

On Thu, 7 Feb 2008, Tyler Smith wrote:

> On 2008-02-07, Birgit Lemcke <birgit.lemcke@systbot.uzh.ch> wrote:
>>
>> Am 06.02.2008 um 21:00 schrieb Tyler Smith:
>>>
>>>> My dataset contains variables of the classes factor and numeric. Is
>>>> there another function that is able to handle this?
>>>
>>> The numeric variables are fine. The factor variables may have to be
>>> recoded into dummy binary variables; I'm not sure if lda() will deal
>>> with them properly otherwise.
>>
>> But aren't binary variables also factors? Or is there another
>> variable class than factor or numeric?
>> Do I have to set the class of the binaries as numeric?
>>
>
> There is no binary class in R, so you would have to use a numeric
> field. For example:

Then what do you consider the logical type to be?

(Strictly it is not binary because of NAs, but it is used for binary variables in model formulae.)
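
As a minimal sketch of the point about logicals in model formulae (the data frame d and its columns y and flag are made up for illustration, not taken from the thread), a logical column is expanded into a single 0/1 indicator column by model.matrix():

```r
# Hypothetical toy data; the names d, y and flag are assumptions.
d <- data.frame(
  y    = c(1.2, 3.4, 2.2, 4.1),
  flag = c(TRUE, FALSE, TRUE, FALSE)
)

class(d$flag)      # "logical"

# In a model formula the logical is coded as one indicator column.
X <- model.matrix(y ~ flag, data = d)
colnames(X)        # "(Intercept)" "flagTRUE"
X[, "flagTRUE"]    # 1 0 1 0
```

So a binary variable stored as logical needs no manual conversion to numeric before it is used in a formula.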

>
> | sample | factor_1 |
> |--------+----------|
> | A      | red      |
> | B      | green    |
> | C      | blue     |
>
> becomes:
>
> | sample | dummy_1 | dummy_2 |
> |--------+---------+---------|
> | A      |       1 |       0 |
> | B      |       0 |       1 |
> | C      |       0 |       0 |
>
> R can deal with dummy_1 and dummy_2 as numeric vectors. The details
> should be explained in a good reference on multivariate statistics
> (I'm looking at Legendre and Legendre (1998) section 1.5.7 and 11.5).
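
The recoding in the quoted table need not be done by hand. The following is a sketch only, assuming R's default treatment contrasts; the object names samples and dummies are hypothetical, and the level order is chosen so that "blue" is the baseline, as in the table:

```r
# Hypothetical data mirroring the quoted table; level order makes "blue"
# the baseline, "red" the first dummy and "green" the second.
samples <- data.frame(
  sample   = c("A", "B", "C"),
  factor_1 = factor(c("red", "green", "blue"),
                    levels = c("blue", "red", "green"))
)

# model.matrix() builds the indicator columns; drop the intercept column.
dummies <- model.matrix(~ factor_1, data = samples)[, -1, drop = FALSE]
colnames(dummies) <- c("dummy_1", "dummy_2")
rownames(dummies) <- samples$sample
dummies
```

With the default contrasts this gives rows 1 0, 0 1 and 0 0 for samples A, B and C, matching the table.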

The issue is rather a statistical one: the theory behind LDA assumes continuous variables, indeed a multivariate normal distribution. You can apply LDA to binary explanatory variables, but there are much more appropriate methods (as indeed there are for factor explanatory variables).
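
For the mechanics only (this does not answer the statistical objection above), here is a sketch of what the formula interface of lda() in package MASS does with a factor: it expands the factor into indicator columns itself. The data are simulated and the names d, group, x, colour and fit are assumptions.

```r
library(MASS)

# Simulated, purely illustrative data: two groups, one continuous
# variable and one three-level factor.
set.seed(1)
d <- data.frame(
  group  = factor(rep(c("g1", "g2"), each = 20)),
  x      = c(rnorm(20, mean = 0), rnorm(20, mean = 2)),
  colour = factor(rep(c("red", "green", "blue"), length.out = 40))
)

# The formula method expands 'colour' into dummy columns internally.
fit <- lda(group ~ x + colour, data = d)
rownames(fit$scaling)   # "x" "colourgreen" "colourred"
```

That lda() accepts the factor says nothing about whether the multivariate normal assumption is reasonable for such columns.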

> HTH,
>
> Tyler
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Received on Thu 07 Feb 2008 - 16:28:02 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 07 Feb 2008 - 17:30:12 GMT.

