Re: [R] Generate a serie of new vars that correlate with existing var

From: Olivier ETERRADOSSI <olivier.eterradossi_at_ema.fr>
Date: Fri 06 Apr 2007 - 08:03:48 GMT

Hello Greg (and List),
Thanks for your reply and reflections (and sorry for my "frenglish"...). Of course you're right, and I agree "a posteriori" with all your views. My suggestion was probably, first of all, a mark of appreciation for your solution ;-)
Here is the path I followed to get where I was, though I now see that I was probably misunderstanding what makes up the "core" of R:
1) The question of making such related pairs of vectors is nearly a FAQ, as you point out in your reply.
2) It appeared to me that it is often asked by newbies or users with relatively little statistical knowledge.
3) To get to your solution, a good understanding is needed of what correlation is, as well as of matrix properties and operators. My guess was that the people listed above generally do not have it.
4) I believed from my own experience that the core of R was dedicated either to basics or to rather complicated algorithms that handle or produce results appearing "simple" or "classical".
5) Again from my own experience, I could not imagine which non-core package such a function should "obviously" be added to. I imagined that, in the same manner, a person looking for the function could have trouble locating it. I have not yet looked at your TeachingDemos package (I will), but I know of other categories of searchers, often not statisticians, who need to generate such data and would not think of looking there.
In the end, all this mainly shows that I did not understand the R philosophy as well as I thought!
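
To illustrate point 3, here is a minimal sketch of one common way to build a vector with a chosen sample correlation to an existing one. It is not necessarily the solution posted earlier in this thread; the data, the target value `rho`, and all variable names are assumptions made up for the example.

```r
set.seed(1)

x   <- rnorm(50)   # the existing variable (made-up data for the example)
rho <- 0.6         # desired sample correlation (assumed value)

z <- rnorm(length(x))        # fresh random noise
e <- residuals(lm(z ~ x))    # the part of z that is uncorrelated with x

xs <- (x - mean(x)) / sd(x)  # x centred and scaled to unit variance
es <- e / sd(e)              # residuals already have mean 0; scale to unit variance

# Mix the two orthogonal pieces so that the result has correlation rho with x
y <- rho * xs + sqrt(1 - rho^2) * es

cor(x, y)
```

The correlation is matched in the sample (up to rounding error), not only on average, because the residuals are exactly uncorrelated with `x`. Repeating the last steps with new noise `z` gives further variables with the same correlation to `x`.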
Thanks, and regards. Olivier

Greg Snow a écrit :
> Oliver,
>
> I have thought of adding something like this to a package, but here is my current thinking on the issue.
>
> This question (or similar) has been asked a few times, so there is some demand for a general answer. I see three approaches:
>
> 1. Have an example of the necessary steps archived in a publicly available place.
> 2. Write a function and include it in a non-core package.
> 3. Add it to the core of R or a core package.
>
> Number 1 is already in progress, as the e-mails will be part of the archive, though someone is welcome to add it to the Wiki if they think that would be useful as well.
>
> Your suggestion is number 3, but I would argue that 2 is better than 3 for the simple reason that anything added to the core is implied to be top quality and to have pretty much any option that most people would think of. Putting it in a non-core package makes it available, with fewer implications about quality.
>
> The question then becomes, what options do we make available? Do we have the user specify the entire correlation structure? Or just assume the new variables will be independent of each other? What should the function do if the set of correlations results in a matrix that is not positive definite? What if the user wants to have 2 fixed variables? And other questions.
>
> My current thinking is that the process is simple enough that it is easier to do this by hand than to remember all the options to the function. There are currently people who use bootstrap and permutation tests without loading in the packages that do these because it is quicker to write the code by hand than to remember the syntax of the functions. I think this type of data generation falls under the same situation. But if you, or someone else thinks that there is enough justification for a function to do this, and can specify what options it should have, I will be happy to add it to my TeachingDemos package (this seems an appropriate place, since one of the places that I want to generate data with a specific correlation structure is when creating an example for students).
>
>
> Hope this helps,
>
>
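
On the question above about a set of correlations that does not give a positive definite matrix, a quick check is possible with base R's `eigen()`. The following is only a sketch: the helper name `is_pos_def`, the tolerance, and the two example matrices are assumptions, not part of any existing package.

```r
# Hypothetical helper: TRUE if all eigenvalues of a symmetric matrix exceed tol
is_pos_def <- function(R, tol = 1e-8) {
  ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

# A valid correlation structure: every pair correlated at 0.5
ok <- matrix(0.5, nrow = 3, ncol = 3)
diag(ok) <- 1

# An impossible one: variable 1 correlates at 0.9 with both 2 and 3,
# yet 2 and 3 are asked to correlate at -0.9
bad <- matrix(c( 1.0,  0.9,  0.9,
                 0.9,  1.0, -0.9,
                 0.9, -0.9,  1.0), nrow = 3, byrow = TRUE)

is_pos_def(ok)
is_pos_def(bad)
```

What a function should do after a failed check (stop with an error, or adjust the matrix) is exactly the kind of option that would have to be decided.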

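As an illustration of the remark above that a permutation test can be quicker to write by hand than to look up, here is a minimal sketch for a difference in two group means. The data values, the number of permutations, and the variable names are all made up for the example.

```r
set.seed(42)

# Two small made-up samples
a <- c(5.1, 4.9, 6.2, 5.8, 6.0)
b <- c(4.2, 4.8, 5.0, 4.4, 4.6)

obs    <- mean(a) - mean(b)   # observed difference in means
pooled <- c(a, b)

# Reshuffle the group labels many times and recompute the difference
perm <- replicate(2000, {
  idx <- sample(length(pooled), length(a))
  mean(pooled[idx]) - mean(pooled[-idx])
})

# Two-sided p-value: share of reshuffles at least as extreme as the observed one
p_value <- mean(abs(perm) >= abs(obs))
p_value
```
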
-- 
Olivier ETERRADOSSI
Maître-Assistant
CMGD / Equipe "Propriétés Psycho-Sensorielles des Matériaux"
Ecole des Mines d'Alès
Hélioparc, 2 av. P. Angot, F-64053 PAU CEDEX 9
tel std: +33 (0)5.59.30.54.25
tel direct: +33 (0)5.59.30.90.35 
fax: +33 (0)5.59.30.63.68
http://www.ema.fr

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri Apr 06 18:10:37 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Mon 09 Apr 2007 - 15:30:56 GMT.
