From: <uttam.phulwale_at_tcs.com>

Date: Thu 06 Oct 2005 - 14:11:52 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Oct 06 14:26:01 2005


Hello Everybody,

I am referring to David Meyer's "Benchmarking Support Vector Machines",
Report No. 78 (Nov. 2002). I am new to R and am not sure how missing
values in the benchmark datasets are handled. I would be very thankful
if you could let me know how to handle missing numerical and categorical
variables in the data (e.g. BreastCancer).

I ask because, for SVM, I am getting fewer predictions from the trained model than there are test observations, so I cannot calculate the confusion matrix. At the same time, the functions lda(), fda() and rpart() did return as many predictions as there are test observations. So I am quite confused about how these functions handle missing values: are the missing values imputed with the mean, the median, or a new category?
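To make the mismatch concrete, here is a minimal sketch in base R of how one might count the missing values and keep the labels aligned with the rows that survive. The data frame `toy` and its columns are made up for illustration and are not the BreastCancer data; the sketch also assumes, without having confirmed it, that the SVM routine silently drops incomplete rows.

```r
# Hypothetical toy data standing in for a benchmark set with missing values
toy <- data.frame(
  x1    = c(1.2, NA, 3.1, 4.0, 2.2, NA),
  x2    = factor(c("a", "b", NA, "a", "b", "a")),
  Class = factor(c("benign", "malignant", "benign",
                   "malignant", "benign", "malignant"))
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))

# TRUE for rows that have no missing value in any column
complete_rows <- complete.cases(toy)

# Keep only the complete rows, so predictors and labels stay aligned
toy_complete <- toy[complete_rows, ]
y_complete   <- toy_complete$Class

print(na_counts)
print(nrow(toy_complete))
```

If predictions are made only for complete rows, comparing them against `y_complete` rather than against the full label vector would give vectors of equal length.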

I have another problem with the Generalized Linear Model function, glm(). I might have committed some error, but I am not sure where.

The script I have tried for the glm() function is:

trdata<-data.frame(train,row.names=NULL)
attach(trdata)

glmmod <- glm(Class~., family= binomial(link = "logit"),data=trdata,maxit=50)

tstdata<-data.frame(test,row.names=NULL)
attach(tstdata)

xtst <- subset(tstdata, select = -Class)
ytst <- Class

pred<-predict(glmmod,xtst)

library(mda)

confusion(pred,ytst)

Can you help me sort out these problems?
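One thing that may be worth checking in the glm script: predict() on a glm object returns values on the link (log-odds) scale by default, not class labels, so they cannot be tabulated against the true classes directly. The following is only a sketch on simulated data, assuming Class is a two-level factor; the data, the 0.5 cut-off and the names `d`, `fit`, `prob` and `pred_class` are illustrative assumptions, and base table() is used in place of confusion().

```r
# Simulated two-class data (hypothetical, for illustration only)
set.seed(1)
n <- 100
x <- rnorm(n)
d <- data.frame(
  x     = x,
  Class = factor(ifelse(x + rnorm(n) > 0, "malignant", "benign"))
)
train <- d[1:70, ]
test  <- d[71:100, ]

# Logistic regression; the first factor level is treated as "failure"
fit <- glm(Class ~ ., family = binomial(link = "logit"), data = train)

# type = "response" gives the probability of the second factor level
prob <- predict(fit, newdata = test, type = "response")

# Turn probabilities into class labels with an assumed 0.5 cut-off
lev <- levels(train$Class)
pred_class <- factor(ifelse(prob > 0.5, lev[2], lev[1]), levels = lev)

# Cross-tabulate predicted against true classes
conf_mat <- table(predicted = pred_class, true = test$Class)
print(conf_mat)
```

The same idea should carry over to the script above by passing type = "response" to predict() and thresholding the result before building the confusion matrix.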

Uttam Phulwale

Tata Consultancy Services Limited

Mailto: uttam.phulwale@tcs.com

Website: http://www.tcs.com


