# Re: [R] a correlation matrix subset where the subset avg is a maximum

From: Ryan Austin <austin_at_botany.utoronto.ca>
Date: Fri 13 Oct 2006 - 21:33:33 GMT

Thanks for the thought in any case, Mark. You're right about the brute force. I'll expand a bit with an example, though, for the sake of clarity.

Given a correlation matrix of 4 covariates A, B, C, D with pairwise r values of: AB=0.2; AC=0.6; AD=0.3; BC=0.9; BD=0.8; CD=0.7

Find the optimal subset of covariates (subset size > n, n being a chosen lower bound on the number of covariates in the subset) where the mean of r for the subset is a maximum. Of course, all pairwise r values between the chosen subset's covariates need to be considered.

Thus for n>1, the solution would simply be BC = 0.9. And for n>2, the solution would be BCD, as (BC + CD + BD)/3 = 0.8 is the maximum mean r value that could be obtained from any of the subsets with n>2.
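To make the example concrete, here is a minimal brute-force sketch in Python (not R), written only to check the two answers above on this 4-covariate case. The names `R`, `mean_r` and `best_subset` are made up for illustration, and "size > n" is expressed as `min_size = n + 1`.

```python
from itertools import combinations

# Pairwise r values from the example (symmetric, so each pair is stored once).
R = {
    frozenset("AB"): 0.2, frozenset("AC"): 0.6, frozenset("AD"): 0.3,
    frozenset("BC"): 0.9, frozenset("BD"): 0.8, frozenset("CD"): 0.7,
}

def mean_r(subset):
    """Mean of r over every pair of covariates in the subset."""
    pairs = list(combinations(subset, 2))
    return sum(R[frozenset(p)] for p in pairs) / len(pairs)

def best_subset(labels, min_size):
    """Exhaustively search all subsets with at least min_size members."""
    candidates = (
        s
        for k in range(min_size, len(labels) + 1)
        for s in combinations(labels, k)
    )
    return max(candidates, key=mean_r)

for min_size in (2, 3):
    best = best_subset("ABCD", min_size)
    print(min_size, "".join(best), round(mean_r(best), 3))
```

This enumerates every subset, so it is only practical for a small number of covariates.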

I'd expected that this would be a common problem, but 2 days of googling have given me little. I'm expecting a greedy graph traversal or the like will be my answer, but I'd hoped to whip up a solution in R. Any help would be greatly appreciated.
Ryan
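One possible shape for the greedy idea, sketched in Python (not R) purely as an illustration: seed with the best pair, then keep adding whichever covariate gives the highest subset mean. This is a heuristic under my own assumptions, not a known solution to the problem, and it is not guaranteed to find the true optimum in general. The names `R`, `mean_r` and `greedy_subset` are hypothetical, and `min_size` stands for n + 1.

```python
from itertools import combinations

# Pairwise r values from the 4-covariate example.
R = {
    frozenset("AB"): 0.2, frozenset("AC"): 0.6, frozenset("AD"): 0.3,
    frozenset("BC"): 0.9, frozenset("BD"): 0.8, frozenset("CD"): 0.7,
}

def mean_r(subset):
    """Mean of r over every pair of covariates in the subset."""
    pairs = list(combinations(subset, 2))
    return sum(R[frozenset(p)] for p in pairs) / len(pairs)

def greedy_subset(labels, min_size):
    """Greedy heuristic: grow the subset one covariate at a time."""
    # Seed with the single best pair.
    chosen = list(max(combinations(labels, 2), key=mean_r))
    while len(chosen) < len(labels):
        rest = [x for x in labels if x not in chosen]
        # Candidate whose addition gives the highest subset mean.
        nxt = max(rest, key=lambda x: mean_r(chosen + [x]))
        # Once the size requirement is met, stop unless the mean improves.
        if len(chosen) >= min_size and mean_r(chosen + [nxt]) <= mean_r(chosen):
            break
        chosen.append(nxt)
    return sorted(chosen)

print(greedy_subset("ABCD", 2))
print(greedy_subset("ABCD", 3))
```

On the example it happens to agree with the exhaustive answers (BC and BCD), but each step only evaluates the remaining covariates rather than all subsets, so it can miss the optimum on other matrices.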

Leeds, Mark (IED) wrote:

>hi ryan : I reread and you already have the correlation matrix so brute
>force should definitely work.
>
>So, if the correlation matrix was size 20 by 20 and your n was 9.
>
>Then, you have to have subsets of size 10 or greater so the number of
>possibilities would be ( 20 choose 10 ) + ( 20 choose 11 ) + ( 20
>choose 12 ) + ( 20 choose 13 ) + ......... ( 20 choose 20 )
>
>Oh boy, it is too large a problem to do by brute force. There are too
>many possibilities even for this size of problem.
>Hopefully someone else will have a better idea. Forget my brute force
>idea. It's useless and I apologize. I made a mistake.
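The sum quoted above can be evaluated directly. Below is a small Python sketch (the function name `n_candidate_subsets` is made up for illustration) that counts the subsets of p covariates with at least a given size. The count grows roughly like 2^p as the number of covariates increases.

```python
from math import comb

def n_candidate_subsets(p, min_size):
    """Number of subsets of p covariates with at least min_size members."""
    return sum(comb(p, k) for k in range(min_size, p + 1))

# The 20-by-20 case with n = 9, i.e. subsets of size 10 or greater.
print(n_candidate_subsets(20, 10))  # 616666

# The same lower bound with twice as many covariates.
print(n_candidate_subsets(40, 10))
```

Each counted subset of size k would also need its k(k-1)/2 pairwise values averaged, so the total work is larger than the subset count alone.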
>
>-----Original Message-----
>From: r-help-bounces@stat.math.ethz.ch
>[mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Ryan Austin
>Sent: Friday, October 13, 2006 2:43 PM
>To: r-help@stat.math.ethz.ch
>Subject: [R] a correlation matrix subset where the subset avg is a
>maximum
>
>Hello R group,
>
>Given a correlation matrix, I would like to obtain the best subset of
>pairs in the matrix of some size > n such that the mean of r for that
>subset is a maximum compared to any other possible subset of size > n.
>I've been looking at the deal and subselect packages but they don't seem
>to do what I need. Does anyone have any suggestions?
>
>Ryan
>
>______________________________________________
>R-help@stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

Received on Sat Oct 14 07:45:19 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Fri 13 Oct 2006 - 22:30:11 GMT.
