I have a bivariate data set: one variable (vegetation type) is categorical, and the other (computed annual insolation) is continuous. Plotting veg_type ~ insolation gives a nice overview of the patterns I can see in the source data. However, because of the large number of samples (1,000) and the apparent "spread" of each vegetation type over a range of insolation values, I am having a hard time describing the relationship between the two variables quantitatively.

Since the data along each vegetation type "line" are not a distribution in the traditional sense, I am having problems applying descriptive statistical methods. Conceptually, I would like to somehow describe how insolation varies along each vegetation type "line".
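To make the goal more concrete, here is a minimal sketch of the kind of per-type summary I have in mind: group the insolation values by vegetation type and report a five-number summary plus the interquartile range for each group. Everything in it is an assumption for illustration only: the function name `summarize_by_type`, the vegetation labels, and the synthetic insolation values are made up and are not my actual data. I am not sure this is the right approach, which is partly why I am asking.

```python
import random
import statistics
from collections import defaultdict


def summarize_by_type(records):
    """Group insolation values by vegetation type and summarize each group.

    `records` is an iterable of (veg_type, insolation) pairs.
    Each group needs at least two values for the quartiles to be defined.
    """
    groups = defaultdict(list)
    for veg_type, insolation in records:
        groups[veg_type].append(insolation)

    summary = {}
    for veg_type, values in groups.items():
        # quantiles(n=4) returns the three quartile cut points
        q1, median, q3 = statistics.quantiles(values, n=4)
        summary[veg_type] = {
            "n": len(values),
            "min": min(values),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(values),
            "iqr": q3 - q1,
        }
    return summary


# Synthetic stand-in data (hypothetical labels and values, not real measurements)
random.seed(0)
centers = {"forest": 1200.0, "grassland": 1400.0, "shrubland": 1600.0}
records = [
    (veg_type, random.gauss(center, 100.0))
    for veg_type, center in centers.items()
    for _ in range(300)
]

for veg_type, stats in sorted(summarize_by_type(records).items()):
    print(veg_type, {key: round(value, 1) for key, value in stats.items()})
```

A table like this would at least let me compare the typical insolation and its spread across vegetation types, but it does not tell me whether the differences between types are meaningful.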

Any guidance or suggested reading material would be greatly appreciated.

