From: lalitha viswanath <lalithaviswanath_at_yahoo.com>

Date: Fri 05 May 2006 - 05:19:49 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri May 05 05:24:33 2006

Hi

I am trying to make an x-y plot of the values of a
certain variable against bins.

i.e. the x-axis goes from 0 to 0.7 in increments of
0.02, while the y-axis is the average of the values for all
the points in that interval.

Hence I first used cut to break the data into intervals, then I applied tapply using mean as the function and plotted the results.

I also replaced mean with median.
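
To make the binning step concrete, here is a minimal sketch on a tiny made-up data set (the vectors age and clock_rate below are hypothetical values, not rows from sorted_inp). It shows how a single large value can pull the mean of a bin up while the median of the same bin stays low.

```r
# Toy data: hypothetical values chosen for illustration only
age        <- c(0.01, 0.015, 0.019, 0.03, 0.035, 0.05)
clock_rate <- c(1, 2, 30, 10, 20, 5)

# Three bins: (0,0.02], (0.02,0.04], (0.04,0.06]
bins <- cut(age, breaks = seq(0, 0.06, by = 0.02))

# Summarise clock_rate within each bin
bin_mean   <- tapply(clock_rate, bins, mean)
bin_median <- tapply(clock_rate, bins, median)

# First bin holds 1, 2 and 30: mean is 11, median is 2
print(rbind(mean = bin_mean, median = bin_median))
```

Note that cut() uses right-closed intervals by default, so a value of exactly 0 would fall outside the first bin unless include.lowest = TRUE is given.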

One of the 3 sets of functions that I used is shown further below.

However, I am finding that the values plotted on the y-axis do not seem to be correct.

For example, in the interval 0.38-0.4 there is a
huge number of points with a y-axis value below 20,
while there are very few with a y-axis value above 20.
However, the median plotted is still around the 20
mark.

It does not seem intuitive looking at the data that
more than 50% of the points have a clock_rate (plotted
on the y-axis) above 20.

Is there something about the way these functions work
with tapply, that I am missing?

Any obvious mistakes that I should look for?

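# Bin age into intervals of width 0.02
SWfac <- cut(sorted_inp$age[1:290], seq(0, 0.7, 0.02))
# Mean clock_rate per bin (mean was replaced with median for the median plot)
SLmean <- tapply(sorted_inp$clock_rate[1:290], SWfac, mean)

plot(SLmean, type = "b", xaxt = "n")

# Label each plotted point with its interval
axis(1, seq_along(SLmean), levels(SWfac))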

I also tried a simple x-y scatter plot of the same 290 rows in Excel (without binning them), and the concentration of points at low clock rates does not suggest that the medians should be as high as they are shown.
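
One way to cross-check a single bin by hand is sketched below. Since the real sorted_inp is not shown here, the data frame inp is a simulated stand-in and all of its values are made up; only the checking steps are the point. For the bin containing age 0.39, it counts the points, computes the share with clock_rate above 20, and compares a directly computed median with the one tapply returns.

```r
set.seed(1)  # simulated stand-in for sorted_inp; values are invented

# Ages placed at bin midpoints so every bin is populated
inp <- data.frame(
  age        = rep(seq(0.01, 0.69, by = 0.02), length.out = 290),
  clock_rate = rexp(290, rate = 1 / 15)
)

breaks <- seq(0, 0.7, by = 0.02)
bins   <- cut(inp$age, breaks = breaks)

n_per_bin  <- table(bins)                               # points in each bin
frac_above <- tapply(inp$clock_rate > 20, bins, mean)   # share above 20
bin_median <- tapply(inp$clock_rate, bins, median)      # median per bin

# Label of the bin that contains age = 0.39
target <- as.character(cut(0.39, breaks = breaks))
in_bin <- inp$clock_rate[bins == target]

# Direct computation on the raw values of that one bin
cat("bin:", target, " n:", length(in_bin),
    " share > 20:", mean(in_bin > 20),
    " median:", median(in_bin), "\n")
```

If the median printed for the raw values agrees with bin_median[target], then tapply is doing what is expected, and the discrepancy is more likely to come from which rows are selected or from how the axis labels line up with the plotted points.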

Hoping to hear further

Regards

Lalitha

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Fri 05 May 2006 - 06:09:59 EST.