[R] linear discriminant analysis

From: Research Scholar <thesis1977_at_gmail.com>
Date: Thu, 06 Mar 2008 21:35:25 -0500


Dear R help list,

I have a training dataset that looks like Table 1 and an unknown dataset that looks like Table 2. I want a program that searches the training dataset and identifies which category (type1, type2 or type3) the unknown sample belongs to. If the unknown does not belong to any of the categories, it should tell me so.
The real dataset has 600 variables and 50 sample types.

I tried linear discriminant analysis (lda in the MASS package) and its predict function. It works well, but I think lda always assigns the unknown to one of the types. Most of my unknowns will not be from any category in the training dataset, and I want to avoid false-positive identifications.

Table 1: Three types and 10 variables

       type1 type1 type1 type2 type2 type2 type3 type3 type3
var1      24    28    25    50    51    46    18    20    16
var2       4     5     4     9     8     9    10     9    10
var3       7     7     7    12    12    12     9     6     6
var4       4     5     4    10    12     9     2     2     2
var5       4     5     4    10     9    10     3     2     3
var6       5     4     5     2     3     2     1     3     5
var7       5     4     5     7     7     7     3     3     3
var8       3     4     3    10    10     8     4     2     4
var9       3     4     3     2     2     2     2     2     2
var10      3     3     3     4     4     4     3     1     2

Table 2

    unknown
var1 23
var2 4
var3 7
var4 4
var5 4
var6 6
var7 5
var8 3
var9 3
var10 3
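
To make the requirement concrete, here is a rough sketch in base R of the behaviour I am after, using the numbers from Table 1 and Table 2. It is not lda: it is a plain nearest-centroid rule with a reject option, assuming Euclidean distance on the raw variables. The function name classify_with_reject and the slack parameter are made up for illustration, and the threshold rule (reject when the unknown is further from the nearest centroid than slack times the largest training-sample distance in that class) is only one possible choice.

```r
# Table 1: rows are variables, columns are samples
train <- matrix(c(
  24, 28, 25, 50, 51, 46, 18, 20, 16,
   4,  5,  4,  9,  8,  9, 10,  9, 10,
   7,  7,  7, 12, 12, 12,  9,  6,  6,
   4,  5,  4, 10, 12,  9,  2,  2,  2,
   4,  5,  4, 10,  9, 10,  3,  2,  3,
   5,  4,  5,  2,  3,  2,  1,  3,  5,
   5,  4,  5,  7,  7,  7,  3,  3,  3,
   3,  4,  3, 10, 10,  8,  4,  2,  4,
   3,  4,  3,  2,  2,  2,  2,  2,  2,
   3,  3,  3,  4,  4,  4,  3,  1,  2),
  nrow = 10, byrow = TRUE,
  dimnames = list(paste0("var", 1:10), NULL))
type <- factor(rep(c("type1", "type2", "type3"), each = 3))

# Table 2: the unknown sample
unknown <- c(23, 4, 7, 4, 4, 6, 5, 3, 3, 3)

# Hypothetical helper: nearest centroid with a "none" outcome.
# slack is an assumed tuning parameter, not something lda provides.
classify_with_reject <- function(train, type, x, slack = 1.5) {
  # One centroid per type (one column per type, one row per variable)
  centroids <- sapply(levels(type), function(k)
    rowMeans(train[, type == k, drop = FALSE]))
  # Class radius: largest distance from a training sample to its own centroid
  radius <- sapply(levels(type), function(k) {
    d <- train[, type == k, drop = FALSE] - centroids[, k]
    max(sqrt(colSums(d^2)))
  })
  # Euclidean distance from the unknown to every centroid
  dist_x <- sqrt(colSums((centroids - x)^2))
  best <- names(which.min(dist_x))
  # Accept the nearest type only if the unknown is close enough to it
  if (dist_x[[best]] <= slack * radius[[best]]) best else "none"
}

classify_with_reject(train, type, unknown)
classify_with_reject(train, type, rep(100, 10))
```

With the default slack the Table 2 sample comes out as type1, and a sample with every variable equal to 100 comes out as "none". With slack = 1 the Table 2 sample would also be rejected, because it lies slightly further from the type1 centroid than any type1 training sample does, so the threshold clearly needs tuning. Variables on very different scales would probably need standardising first.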

Thanks

RS




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 07 Mar 2008 - 02:37:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 07 Mar 2008 - 03:30:20 GMT.
