From: Adaikalavan Ramasamy <ramasamy_at_cancer.org.uk>

Date: Thu 08 Jul 2004 - 04:36:05 EST

R-help@stat.math.ethz.ch mailing list

https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu Jul 08 04:39:52 2004


On Wed, 2004-07-07 at 18:29, Bret Collier wrote:

> R-users,
>
> I have been using R for about 1 year, and I have run across a
> couple of graphics problems that I am not quite sure how to address. I have
> read up on the email threads regarding the differences between density and
> relative frequencies (count/sum(count)) on the R list, and I am hoping that
> someone could provide me with some advice/comments concerning my
> approach. I will admit that some of the underlying mathematics of the
> density discussion are beyond my current understanding, but I am looking
> into it.
>
> I have a data set (600,000 obs) used to parameterize a probabilistic causal
> model where each obs is a population response for one of 2 classes (either
> regs1 or regs2). I have been attempting to create 1 marginal probability
> plot with 2 lines (one for each class). Using my rather rough code, I
> created a plot that seems to adhere to the commonly used (although, from
> what I can understand, wrong) relative frequency histogram approach.
>
> My rough code looks like this:
>
> bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)
> par(mfrow=c(1, 1))
> fawn1 <- hist(MFAWNRESID[regs1], plot=F, breaks=bk)
> fawn2 <- hist(MFAWNRESID[regs2], plot=F, breaks=bk)
> count1 <- fawn1$counts/sum(fawn1$counts)
> count2 <- fawn2$counts/sum(fawn2$counts)
> b <- c(0, .05, .1, .15, .2, .25, .3, .35)
> plot(count1~b,xaxt="n", xlim=c(0, .5), ylim=c(0, .40), pch=".", bty="l")
> lines(spline(count1~b), lty=c(1), lwd=c(2), col="black")
> lines(spline(count2~b), lty=c(2), lwd=c(2), col="black")
> axis(side=1, at=c(0, .05, .1, .15, .2, .25, .3, .35))

Have you considered density() and plot.density() by any chance?
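Something along these lines might do as a starting point. It is only a rough sketch: the data are simulated stand-ins for MFAWNRESID, regs1 and regs2 (which I do not have), the bandwidth is left at the density() default, and the from/to limits assume the response lies between 0 and 1.

```r
# Simulated stand-ins for MFAWNRESID and the class indicators (assumptions,
# not the poster's data)
set.seed(1)
MFAWNRESID <- c(rbeta(5000, 2, 15), rbeta(5000, 3, 15))
regs1 <- rep(c(TRUE, FALSE), each = 5000)
regs2 <- !regs1

# Kernel density estimates, evaluated on a grid from 0 to 1
d1 <- density(MFAWNRESID[regs1], from = 0, to = 1)
d2 <- density(MFAWNRESID[regs2], from = 0, to = 1)

# plot() dispatches to plot.density(); lines() overlays the second class
plot(d1, xlim = c(0, 0.35), ylim = range(0, d1$y, d2$y),
     lty = 1, lwd = 2, bty = "l", main = "", xlab = "MFAWNRESID")
lines(d2, lty = 2, lwd = 2)
```

Note that the y-axis here is a density (per unit of x), not a proportion per bin, so the values will not match count1 and count2.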

> Using the above, I get frequency values for regs1 that look like this
> (which is the same as output for my probabilistic model):
> > count1
> [1] 1.213378e-01 3.454324e-01 3.365343e-01 1.580839e-01 3.342101e-02
> [6] 4.698426e-03 4.488942e-04 4.322685e-05

I would tend to use the term proportion rather than frequency.
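As for how proportions relate to the density that hist() reports: the density is the proportion divided by the bin width. A small check of that, using simulated data as a stand-in (an assumption) and the same breaks as in the quoted code:

```r
set.seed(1)
x  <- rbeta(10000, 2, 15)    # simulated stand-in for the real data (assumption)
bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)
h  <- hist(x, breaks = bk, plot = FALSE)

prop  <- h$counts / sum(h$counts)   # proportion of observations in each bin
width <- diff(h$breaks)             # bin widths; the last bin is much wider

# Density is proportion per unit of x, so the two differ by the bin width
all.equal(h$density, prop / width)

sum(prop)                # the proportions add up to 1
sum(h$density * width)   # the density integrates to 1
```

With unequal bin widths, as here, the two scales give differently shaped curves, which is worth keeping in mind when comparing plots.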

> First, count1 is the frequency of occurrence within range 0-0.05, but when
> plotted is the value at b=0 and does not really represent the range? Are
> there any suggestions on a technique to approach this?

You can plot it at the midpoints, as hist() does; fawn1$mids gives you these values.
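A minimal sketch of the midpoint approach, again with simulated data standing in for MFAWNRESID[regs1] (an assumption):

```r
set.seed(1)
x  <- rbeta(10000, 2, 15)    # simulated stand-in (assumption)
bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)

fawn1  <- hist(x, breaks = bk, plot = FALSE)
count1 <- fawn1$counts / sum(fawn1$counts)

# Bin midpoints: each is the average of the two breaks bounding the bin
fawn1$mids

# Plot each proportion at the centre of its bin rather than at the left edge
plot(fawn1$mids, count1, type = "b", xlim = c(0, 0.35), ylim = c(0, 0.40),
     bty = "l", xlab = "MFAWNRESID", ylab = "Proportion")
```

The midpoint of the last bin (0.35 to 1) is 0.675, so with xlim ending at 0.35 that point falls outside the plotting region.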

> Next: Using the above code, the x-axis values end at 0.35, but the axis
> continues (because bk ends at 1)? While there is the chance of occurrence
> out past .35, it is low and I want to extend the lines to about .35 and
> clip the x-axis. But, I have been unable to figure out how to clip. Could
> someone point me in the correct direction?

In your plot() call, set xlim=c(0, 0.35). If by 'clipping' you mean truncating the density, then you probably need to readjust your proportions so that they sum to 1.
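If you do want to truncate, a rough sketch would be the following. It uses the count1 values from your output and assumes that dropping the last bin (0.35 to 1) is what you have in mind; the name trunc1 is just for illustration.

```r
# Proportions for regs1, copied from the quoted output
count1 <- c(1.213378e-01, 3.454324e-01, 3.365343e-01, 1.580839e-01,
            3.342101e-02, 4.698426e-03, 4.488942e-04, 4.322685e-05)
bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)

keep   <- bk[-1] <= 0.35                        # bins that end at or below 0.35
trunc1 <- count1[keep] / sum(count1[keep])      # rescale so they sum to 1
mids   <- ((bk[-1] + bk[-length(bk)]) / 2)[keep]

sum(trunc1)

# xlim limits the axis to the retained range
plot(mids, trunc1, type = "b", xlim = c(0, 0.35), bty = "l",
     xlab = "MFAWNRESID", ylab = "Proportion (truncated at 0.35)")
```

Here the dropped bin holds only about 0.004% of the observations, so the rescaling barely changes the values.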


This archive was generated by hypermail 2.1.8 : Wed 03 Nov 2004 - 22:54:45 EST