Re: [R] question on lm or glm matrix of coeficients X test data terms

From: DS <ds5j_at_excite.com>
Date: Tue, 08 Jul 2008 19:33:19 -0400 (EDT)

Hi,

  I found some of what I was looking for.

Using the following, I can get a matrix of the regression coefficients multiplied out by the variable data.

g<-predict(comodel,type='terms',data4)

m<-cbind(data4,g)   

What remains is how to pick, for each data row, the 3-4 columns with the highest values.

I need to get the column names of the top 3 coefficients from this matrix.

That means looping through each row, picking the top 3 coefficient/variable products, and then getting the column names for these 3.

Is there an easy way to get this in an R function?

thanks

Dhruv
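
A minimal sketch of the row-wise "top 3" step asked about above. The names comodel and data4 are taken from the message, but the data behind them here is synthetic and purely an assumption for illustration. Note that predict(..., type = "terms") returns centered terms (each term is measured relative to its average in the fitting data), so the ranking is of centered contributions rather than raw coefficient * value products.

```r
# Synthetic stand-ins for the poster's comodel and data4 (assumptions).
set.seed(1)
train <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50),
                    x4 = rnorm(50), x5 = rnorm(50))
train$y <- 2 * train$x1 - 1.5 * train$x2 + 0.5 * train$x3 + rnorm(50)
comodel <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = train)
data4 <- train[1:6, ]

# One column per model term, one row per observation.
g <- predict(comodel, newdata = data4, type = "terms")

# For each row, the names of the 3 terms with the largest values.
top3 <- t(apply(g, 1, function(r) names(sort(r, decreasing = TRUE))[1:3]))
colnames(top3) <- c("first", "second", "third")
top3
```

To rank by magnitude regardless of sign, sort(abs(r), decreasing = TRUE) could be used inside the function instead.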

From: Jorge Ivan Velez [mailto: jorgeivanvelez_at_gmail.com]

To: ds5j_at_excite.com

Date: Mon, 7 Jul 2008 21:42:54 -0400

Subject: Re: [R] question on lm or glm matrix of coeficients X test data terms

That's R: you come up with solutions every time. I hope I'm not bothering you with this. Try also:

# data set (10 rows, 10 columns)
set.seed(123)
X=matrix(rpois(100,10),ncol=10)

# Function to estimate your outcome
outcome=function(x,betas){
  if(length(x)!=length(betas)) stop("x and beta have different lengths!")
  y=x*betas
  sum(y)
}

# let's assume that you want to include x1, x4, x7 and x9 only
# by using beta1=0.5, beta4=0.6, beta7=-0.1, beta9=0.3
betas=c(0.5,0,0,0.6,0,0,-0.1,0,0.3,0)

# Results
apply(X,1,outcome, betas=betas)

HTH,
Jorge

On Mon, Jul 7, 2008 at 9:31 PM, Jorge Ivan Velez <jorgeivanvelez_at_gmail.com> wrote:

Sorry, I forgot to take the sum over the rows:

# data set (10 rows, 10 columns)
set.seed(123)
X=matrix(rpois(100,10),ncol=10)

# Function to estimate your outcome
outcome=function(x,betas){
  if(length(x)!=length(betas)) stop("x and beta have different lengths!")
  y=x*betas
  y
}

# let's assume that you want to include x1, x4, x7 and x9 only
# by using beta1=0.5, beta4=0.6, beta7=-0.1, beta9=0.3
betas=c(0.5,0,0,0.6,0,0,-0.1,0,0.3,0)

# Results
apply(t(apply(X,1,outcome, betas=betas)),1,sum)
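
As a side note, the row-wise products and their sums can probably be computed without a custom function. The sketch below reuses X and betas from the message; the names products, scores and scores_mm are introduced here only for illustration.

```r
set.seed(123)
X <- matrix(rpois(100, 10), ncol = 10)
betas <- c(0.5, 0, 0, 0.6, 0, 0, -0.1, 0, 0.3, 0)

# Row i, column j holds X[i, j] * betas[j].
products <- sweep(X, 2, betas, `*`)

# Sum over each row; the matrix product gives the same numbers.
scores <- rowSums(products)
scores_mm <- drop(X %*% betas)
all.equal(scores, scores_mm)
```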

HTH,
Jorge

On Mon, Jul 7, 2008 at 9:23 PM, Jorge Ivan Velez <jorgeivanvelez_at_gmail.com> wrote:

Dear Dhruv,

It's me again. I've been thinking about it a little bit. If you want to include/exclude variables to estimate your outcome, you could try something like this:

# data set (10 rows, 10 columns)
set.seed(123)
X=matrix(rpois(100,10),ncol=10)

# Function to estimate your outcome
outcome=function(x,betas){
  if(length(x)!=length(betas)) stop("x and beta have different lengths!")
  y=x*betas
  y
}

# let's assume that you want to include x1, x4, x7 and x9 only
# by using beta1=0.5, beta4=0.6, beta7=-0.1, beta9=0.3
betas=c(0.5,0,0,0.6,0,0,-0.1,0,0.3,0)

# Results
t(apply(X,1,outcome, betas=betas))

HTH,
Jorge

On Mon, Jul 7, 2008 at 9:11 PM, Jorge Ivan Velez <jorgeivanvelez_at_gmail.com> wrote:

Dear Dhruv,

The short answer is no, because the function I built doesn't work when there are more variables than coefficients (see the "stop" I introduced). You would need some modifications, such as setting the remaining coefficients to 1 or 0. For example:

# data set (10 rows, 10 columns)
set.seed(123)
X=matrix(rpois(100,10),ncol=10)
X

# Function to estimate your outcome
outcome=function(x,betas,val){
  k=length(x)
  nb=length(betas)
  # pad betas with val when fewer coefficients than variables are given
  if(length(x)!=length(betas)) betas=c(betas, rep(val,k-nb))
  y=x*betas
  y
}

# beta1=1, beta2=2, the rest is equal to zero
t(apply(X,1,outcome,betas=c(1,2),val=0))

# beta1=1, beta2=2, the rest is equal to 1
t(apply(X,1,outcome,betas=c(1,2),val=1))

HTH,
Jorge

On Mon, Jul 7, 2008 at 8:57 PM, DS <ds5j_at_excite.com> wrote:

Thanks, Jorge. I appreciate your quick help.

Will this work if I have 20 columns of data but my regression only has 5 variables?

I am looking for something generic where I can give it my model and test data and get back a vector of the multiplied coefficients (with no hard coding). When predict is called with an input model and data, R must be multiplying all coefficients by the variables and summing the result, but is there a way to get the components of the regression terms stored in a matrix before they are added?

The idea is to build n models with various terms and, after producing a prediction, list the top 3 variables that had the biggest impact for that particular set of predictor values.

E.g., if I build a model to predict default of loans, I would then need to list the top factors in the model that can be used to explain why the loan is risky. With 10-16 variables, which can be present or not for each case, there may be a different 2 or 3 variables that led to the said prediction.

Dhruv
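
One possible generic approach to the request above, offered as a sketch rather than a definitive answer: build the design matrix for the test data from the fitted model itself and scale each column by its coefficient. The helper name term_products, the loans data and all of its column names are hypothetical and synthetic. The sketch assumes no aliased (NA) coefficients and no offset in the model. Extra columns in the test data that the model does not use are simply ignored.

```r
# Hypothetical helper: per-row coefficient * predictor products for an lm/glm fit.
term_products <- function(model, newdata) {
  tt <- delete.response(terms(model))                   # formula without the response
  mf <- model.frame(tt, newdata, xlev = model$xlevels)  # only the variables the model uses
  mm <- model.matrix(tt, mf, contrasts.arg = model$contrasts)
  b <- coef(model)
  sweep(mm[, names(b), drop = FALSE], 2, b, `*`)        # column j times coefficient j
}

# Synthetic loan data (an assumption, for illustration only).
set.seed(42)
n <- 200
loans <- data.frame(income = rnorm(n), debt = rnorm(n), age = rnorm(n),
                    tenure = rnorm(n), inquiries = rnorm(n), unused = rnorm(n))
loans$default <- rbinom(n, 1, plogis(-1 + 1.2 * loans$debt - 0.8 * loans$income))
fit <- glm(default ~ income + debt + age + tenure + inquiries,
           data = loans, family = binomial)

new_loans <- loans[1:5, ]
contrib <- term_products(fit, new_loans)

# The products in each row add up to the prediction on the link (log-odds) scale.
all.equal(unname(rowSums(contrib)),
          unname(predict(fit, newdata = new_loans, type = "link")))

# Top 3 variables per row by absolute contribution, intercept excluded.
vars <- contrib[, colnames(contrib) != "(Intercept)", drop = FALSE]
top3_loans <- t(apply(vars, 1, function(r) names(sort(abs(r), decreasing = TRUE))[1:3]))
top3_loans
```

Ranking by absolute value treats large negative products as influential too; ranking by the signed value instead would list only the factors that push the score up.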

From: Jorge Ivan Velez [mailto: jorgeivanvelez_at_gmail.com]

To: ds5j_at_excite.com

Date: Mon, 7 Jul 2008 20:12:53 -0400

Subject: Re: [R] question on lm or glm matrix of coeficients X test data terms

Dear Dhruv,

Try also:

# data set
set.seed(123)
X=matrix(rpois(10,10),ncol=2)

# Function to estimate your outcome
outcome=function(x,betas){
  if(length(x)!=length(betas)) stop("x and betas are of different length!")
  y=x*betas
  y
}

# outcome for beta1=0.05 and beta2=0.6
t(apply(X,1,outcome,betas=c(0.05,0.6)))

# outcome for beta1=5 and beta2=6
t(apply(X,1,outcome,betas=c(5,6)))

HTH,
Jorge

On Mon, Jul 7, 2008 at 7:56 PM, DS <ds5j_at_excite.com> wrote:

Hi,

  Is there an easy way to get the calculated weights in a regression equation?

For example, if my model has 2 variables, var1 and var2, with coefficients .05 and .6, how can I get the computed values for a test dataset for each coefficient?

data

var1,var2

10,100

So I want to get .5, 60 back in a vector. This is a one-row example, but I would want to get a matrix of multiplied-out coefficients and terms to use in comparing the contribution of variables to the final score, as in a scorecard using logistic regression.

Please advise.

thanks

Dhruv
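
The arithmetic in this example (10 * .05 = .5 and 100 * .6 = 60) can be reproduced with a short sketch. The second data row and the object names betas, test_data and products are added here only for illustration.

```r
# Coefficients and test data from the example; the second row is made up.
betas <- c(var1 = 0.05, var2 = 0.6)
test_data <- data.frame(var1 = c(10, 20), var2 = c(100, 50))

# Multiply each column by its coefficient; the first row is the .5, 60 case.
products <- sweep(as.matrix(test_data), 2, betas, `*`)
products
```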


R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



Received on Tue 08 Jul 2008 - 23:38:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 09 Jul 2008 - 00:31:57 GMT.
