From: The r newbie Fred <thernewbiefred_at_rocketmail.com>

Date: Mon, 07 Mar 2011 01:03:10 -0800 (PST)

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Mon 07 Mar 2011 - 09:41:16 GMT

Hello everyone !

I am currently trying to convert a program from S-plus to R, and I am having some trouble with the S-plus function called "influence(data, statistic,...)".

This function aims to "calculate empirical influence values and related quantities", and is part of the Resample library, which I cannot find for R.
However, two similar functions are available in R:

- the lm.influence(model, ...) function,
- the empinf(data, statistic, ...) function.

I didn't manage to use the lm.influence() function correctly, because it needs a linear model (lm, glm) as input, and what I have as input is a function. (I don't know the R/S-plus languages well, so I may be mistaken, but I believe lm.influence() is not what I should use for my problem...?)
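
A minimal sketch of why lm.influence() does not fit here: it takes a model that has already been fitted, not a data set plus a statistic function. The x and y values below are made up for illustration only.

```r
# Toy data, invented for this illustration
x <- c(1, 2, 3, 4, 5)
y <- c(1.1, 1.9, 3.2, 3.9, 5.1)

fit  <- lm(y ~ x)           # lm.influence() needs a fitted model such as this
infl <- lm.influence(fit)   # leave-one-out diagnostics for that fit

infl$hat            # leverage of each observation
infl$coefficients   # change in each coefficient when one observation is dropped
```

There is no place in this call to pass an arbitrary statistic such as a weighted sum, which is what influence() in S-plus accepts.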

I have tried to use the R empinf() instead, but I am stuck on a problem concerning the input argument "group", which I cannot translate to R.

Here is a copy of the S-plus "influence()" help concerning this argument:

group : vector of length equal to the number of observations in data, for stratified sampling or multiple-sample problems. Sampling is done separately for each group (determined by unique values of this vector). If data is a data frame, this may be a variable in the data frame, or expression involving such variables.

empinf() accepts an argument called "strata" but it doesn't seem to correspond to "group".

Below is a sample test showing my problem:

"testinflu" = function(data, weights) { sum(data[,1]*weights) }
mydata <- cbind(c(1,2,3,4,5), c(1,1,1,1,0))

# In S-plus :

> testinflu(data=mydata, weights=rep(1,length(mydata[,1])))
15

# In R:

> testinflu(data=mydata, weights=rep(1,length(mydata[,1])))
15

# In S-plus :

> influence(data = mydata, statistic=testinflu)$L

          testinflu
[1,] -2.000000e+000
[2,] -1.000000e+000
[3,] -1.776357e-013
[4,]  1.000000e+000
[5,]  2.000000e+000

# In R :

> empinf(data = mydata, statistic=testinflu)

[1] -2.000000e+00 -1.000000e+00 2.220446e-12 1.000000e+00 2.000000e+00

# ==> OK

# In S-plus :

> influence(data = mydata, statistic=testinflu, group = mydata[, 2])$L

     testinflu
[1,]      -1.2
[2,]      -0.4
[3,]       0.4
[4,]       1.2
[5,]       0.0

# In R:

> empinf(data = mydata, statistic=testinflu, strata = mydata[, 2])

[1] -1.5 -0.5 0.5 1.5 0.0

# ==> NOT OK

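One reading that is consistent with the numbers above (an assumption, not something taken from the S-plus or R documentation): the R values are what you get when the weights passed to the statistic sum to 1 within each group, and the S-plus values are what you get when the weights sum to 1 over the whole sample. Each S-plus value is 0.8 times the corresponding R value, and 0.8 is the share 4/5 of the observations that sit in the larger group. The base R sketch below checks this by finite differences. fd_influence and its normalise argument are hypothetical names for this illustration, not part of any package.

```r
mydata <- cbind(c(1, 2, 3, 4, 5), c(1, 1, 1, 1, 0))
testinflu <- function(data, weights) sum(data[, 1] * weights)

# Finite-difference influence values for a weighted statistic.
# normalise = "stratum": weights sum to 1 within each group
# normalise = "overall": weights sum to 1 over the whole sample
fd_influence <- function(data, statistic, group,
                         normalise = c("stratum", "overall"), eps = 1e-3) {
  normalise <- match.arg(normalise)
  n   <- NROW(data)
  eps <- eps / n
  g   <- as.integer(factor(group))   # group codes 1..k
  n_g <- as.vector(table(g)[g])      # size of each observation's group
  w0  <- if (normalise == "stratum") 1 / n_g else rep(1 / n, n)
  t0  <- statistic(data, w0)
  vapply(seq_len(n), function(i) {
    same  <- g == g[i]
    share <- sum(w0[same])           # total weight held by observation i's group
    w <- w0
    w[same] <- (1 - eps) * w0[same]  # shrink the whole group slightly
    w[i] <- w[i] + eps * share       # and give the removed mass to observation i
    (statistic(data, w) - t0) / eps
  }, numeric(1))
}

# Approximately -1.5 -0.5 0.5 1.5 0, the R result above
print(round(fd_influence(mydata, testinflu, mydata[, 2], "stratum"), 6))
# Approximately -1.2 -0.4 0.4 1.2 0, the S-plus result above
print(round(fd_influence(mydata, testinflu, mydata[, 2], "overall"), 6))
```

This only shows that the two weight conventions reproduce the two outputs for this linear statistic. Whether it matches what influence() does internally for other statistics is not established here.
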
So I have a few questions:

- has anyone already experienced the same kind of problem with the influence function?

- is it possible to mimic the use of the "group" argument in empinf()?
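
On the second question, a possible workaround, sketched under an assumption: that empinf() with strata hands the statistic weights that sum to 1 within each stratum, while the S-plus "group" results correspond to weights that sum to 1 over the whole sample. The R output of -1.5 -0.5 0.5 1.5 0.0 is consistent with the first part. If the assumption holds, wrapping the statistic so that it rescales the weights should bring the results in line for this example. rescale_weights is a hypothetical helper name; empinf() comes from the boot package.

```r
library(boot)  # provides empinf()

mydata <- cbind(c(1, 2, 3, 4, 5), c(1, 1, 1, 1, 0))
testinflu <- function(data, weights) sum(data[, 1] * weights)

# Hypothetical helper: wraps a weighted statistic so that weights summing
# to 1 within each stratum are rescaled to sum to 1 over the whole sample.
rescale_weights <- function(statistic, strata) {
  g     <- as.integer(factor(strata))
  share <- as.vector(table(g)[g]) / length(g)   # n_group / n per observation
  function(data, weights, ...) statistic(data, weights * share, ...)
}

L <- empinf(data = mydata,
            statistic = rescale_weights(testinflu, mydata[, 2]),
            strata = mydata[, 2])
print(L)
```

Under the stated assumption the printed values should be close to the S-plus ones (-1.2 -0.4 0.4 1.2 0.0). This has only been reasoned through for the linear statistic in this post, so it is worth checking against S-plus output on a less trivial statistic before relying on it.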

I have looked for answers on the web but couldn't find anything really helpful, so if someone has an idea I would really appreciate it!! :)

Thanks,

Fred

ps : sorry for my broken English ...


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Mon 07 Mar 2011 - 12:40:19 GMT.
