[R] Analyzing large transition matrix

From: Bill Harris <bill_harris_at_facilitatedsystems.com>
Date: Wed, 23 Jun 2010 06:30:50 -0700


Let's say you have a dataframe of car trade-ins. For example, each row contains

oldcar newcar qty

and a typical entry could be

lexus bmw 1

I included the qty column to allow for fleet purchases, where a single purchase may replace several cars at once.

I'd like to show what's going on. I could do a histogram of newcar to show how often each type of car is bought. With 5-10 car types, that works. With 50-100 or more, the legend becomes illegible.
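
One way the frequency plot might stay readable with many car types is to weight by qty, keep only the most common types, lump the rest together, and label the bars directly so that no legend is needed. This is only a sketch: the toy data frame and the cutoff top_n are made up for illustration and are not from the real data.

```r
# Toy stand-in for the real trade-in data frame (values are invented)
trades <- data.frame(
  oldcar = c("lexus", "ford", "bmw", "ford", "toyota"),
  newcar = c("bmw", "bmw", "toyota", "toyota", "ford"),
  qty    = c(1, 3, 2, 1, 4),
  stringsAsFactors = FALSE
)

# Total cars bought per type, weighting each row by qty
bought <- sapply(split(trades$qty, trades$newcar), sum)
bought <- sort(bought, decreasing = TRUE)

# Keep the top_n most common types and lump the rest into "other";
# top_n is an arbitrary illustrative cutoff
top_n <- 2
top <- head(bought, top_n)
if (length(bought) > top_n) {
  top <- c(top, other = sum(bought[-seq_len(top_n)]))
}

# Horizontal bars carry their own labels on the axis, so no legend is required
barplot(rev(top), horiz = TRUE, las = 1, xlab = "cars bought")
```

With 50-100 types, a larger top_n and a smaller cex.names in barplot() would probably be needed.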

I could also do a histogram of oldcar to see what people gave up, but that's less interesting.

I'm considering a correlogram using the corrgram package, but a heat map might work, too. Any tips on making the legends useful in any of this? Any better approaches to try?
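
For the heat map idea, a minimal base-graphics sketch might look like the following. It assumes the column names oldcar, newcar and qty from above; the toy data, the ordering by totals and the cex.axis value are illustrative choices only. The labels go on the axes, which sidesteps a separate legend for the car types.

```r
# Toy stand-in for the real trade-in data frame (values are invented)
trades <- data.frame(
  oldcar = c("lexus", "ford", "bmw", "ford", "toyota"),
  newcar = c("bmw", "bmw", "toyota", "toyota", "ford"),
  qty    = c(1, 3, 2, 1, 4),
  stringsAsFactors = FALSE
)

# Cross-tabulate old vs. new car, summing qty in each cell
m <- unclass(xtabs(qty ~ oldcar + newcar, data = trades))

# Order rows and columns by their totals so the busiest cells cluster together
m <- m[order(rowSums(m), decreasing = TRUE),
       order(colSums(m), decreasing = TRUE), drop = FALSE]

# image() expects z with one row per x value, hence the transpose
image(x = seq_len(ncol(m)), y = seq_len(nrow(m)), z = t(m),
      axes = FALSE, xlab = "newcar", ylab = "oldcar")
axis(1, at = seq_len(ncol(m)), labels = colnames(m), las = 2, cex.axis = 0.7)
axis(2, at = seq_len(nrow(m)), labels = rownames(m), las = 1, cex.axis = 0.7)
```

If a few cells dominate, plotting log1p(m) instead of m may make the smaller counts visible.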

I tried table() and prop.table() to see if I could get transition probabilities as if this were a Markov chain, but dim() comes out 108 78, which is still too big to print or visualize.
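
As a sketch of the Markov-chain view: table() counts rows, so it would ignore qty, whereas xtabs() can sum qty per cell. Rather than printing the full matrix, the nonzero transitions can be listed in long form, largest first. The toy data here is invented and the name long is just illustrative.

```r
# Toy stand-in for the real trade-in data frame (values are invented)
trades <- data.frame(
  oldcar = c("lexus", "ford", "bmw", "ford", "toyota"),
  newcar = c("bmw", "bmw", "toyota", "toyota", "ford"),
  qty    = c(1, 3, 2, 1, 4),
  stringsAsFactors = FALSE
)

# Counts of old -> new transitions, weighted by qty
counts <- xtabs(qty ~ oldcar + newcar, data = trades)

# Row-wise proportions: each row sums to 1, like a transition matrix
probs <- prop.table(counts, margin = 1)

# Long format: one row per (oldcar, newcar) pair, zero cells dropped
long <- as.data.frame(probs, responseName = "prob")
long <- long[long$prob > 0, ]
long <- long[order(long$prob, decreasing = TRUE), ]

# Show only the strongest transitions instead of the whole matrix
print(head(long, 10))
```

Note that rows with few trade-ins give noisy proportions, so it may be worth keeping the raw counts alongside the probabilities.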

Suggestions?

Thanks,

Bill

-- 
Bill Harris                  http://makingsense.facilitatedsystems.com/
Facilitated Systems                              Everett, WA 98208 USA
http://www.facilitatedsystems.com/               phone: +1 425 374-1845

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 23 Jun 2010 - 13:33:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 24 Jun 2010 - 12:00:35 GMT.
