Re: [R] Need ideas on how to show spikes in my data and how to code it in R

From: Thomas Fröjd <tfrojd_at_gmail.com>
Date: Tue, 08 Jul 2008 17:41:30 +0200

Hi, thanks for your answer.

> ii) hist() will not show the same frequencies as density() unless hist has unit bin sizes. density*length is showing number per unit change in Weight; hist shows number per bin width.

I believe this is what is confounding me. I have a bin width of 0.1 in the histogram. Changing

dens$y <- dens$y * (length(weights$Weight))

to

dens$y <- dens$y * (length(weights$Weight)*binwidth)

where binwidth=0.1 seems to output correct graphs.

Can someone verify this is the right approach?
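
One way to check the scaling is to reason about units: density() gives probability per unit of Weight, multiplying by binwidth gives probability per bin, and multiplying by the number of observations gives the expected count per bin, which is what hist(freq=TRUE) draws. Below is a minimal sketch of that check. The data are simulated stand-ins (an assumption, not the original weights or reference sets), and the way the breakpoints are built is just one option that covers the whole data range.

```r
# Simulated stand-ins for the thread's data (assumptions, not the real data)
set.seed(1)
weights   <- data.frame(Weight = rnorm(500, mean = 5, sd = 0.4))
reference <- rnorm(2000, mean = 5, sd = 0.4)
binwidth  <- 0.1

# Breakpoints of width `binwidth` that cover the full range of the data
rng         <- range(weights$Weight)
n_bins      <- ceiling(diff(rng) / binwidth)
breakpoints <- rng[1] + binwidth * (0:n_bins)

# Rescale the reference density to "expected count per bin"
n      <- length(weights$Weight)
dens   <- density(reference)
dens$y <- dens$y * n * binwidth

# Histogram of counts with the rescaled reference curve on top
h <- hist(weights$Weight, breaks = breakpoints, freq = TRUE,
          main = "Counts with rescaled reference density")
lines(dens)

# Area under the rescaled curve; it should be close to n * binwidth
curve_area <- sum(dens$y) * diff(dens$x[1:2])
```

With this scaling the curve and the bars should sit at comparable heights. Without the binwidth factor the curve would be 1/binwidth (here 10) times too tall.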

On Tue, Jul 8, 2008 at 1:45 PM, S Ellison <S.Ellison_at_lgc.co.uk> wrote:
> Two thoughts:
> i) If you have a narrow distribution, the density can be higher than 1. The area comes out at 1 for density, and n for the frequency.
>
> ii) hist() will not show the same frequencies as density() unless hist has unit bin sizes. density*length is showing number per unit change in Weight; hist shows number per bin width.
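
Both points can be checked numerically. The following is a small sketch on simulated data (the narrow normal sample is an assumption chosen for illustration): the peak of the density estimate exceeds 1, yet the area under it stays close to 1, and multiplying by the number of observations gives an area close to n.

```r
# Narrow, simulated sample (an assumption for illustration only)
set.seed(42)
x <- rnorm(1000, mean = 5, sd = 0.05)

d    <- density(x)
peak <- max(d$y)                 # well above 1 for such a narrow sample

# density() evaluates on an evenly spaced grid, so a Riemann sum
# approximates the area under the curve
step <- diff(d$x[1:2])
area <- sum(d$y) * step          # close to 1

# On the "number per unit change" scale the area is close to n
area_freq <- area * length(x)
```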
>
> Try plotting a histogram first, then plot the density on top of that. If they disagree
>
> >>> "Thomas Fröjd" <tfrojd_at_gmail.com> 07/08/08 12:29 PM >>>
> Hi!
>
> Sorry for bothering you again but I can't seem to get it right.
>
> When I multiply the density by the number of observations it seems
> to be way too high. The reference curve is drawn at maybe 20 times
> higher frequency count than it should be.
>
> I use the following code where "weights$Weight" is my weights data and
> "reference" is my reference dataset.
>
> # calculate the right breakpoints
> breakpoints <- seq(min(weights$Weight), max(weights$Weight), by=binwidth)
>
>
> #scale density
> dens <- density(reference)
> dens$y <- dens$y * (length(weights$Weight))
>
> #graph it
> hist(weights$Weight, freq=TRUE, breaks=breakpoints, main=wfiles[i])
>
> lines(dens)
>
> Any ideas are greatly appreciated.
>
> /Thomas
>
>
>
> On Fri, Jun 27, 2008 at 10:54 PM, Daniel Folkinshteyn
> <dfolkins_at_gmail.com> wrote:
>> if you want the "frequency" scale rather than density scale, then leave hist
>> as is (by default it uses the frequency scale), and rescale the density by
>> multiplying it by the appropriate NOBS.
>>
>> on 06/27/2008 01:16 PM Thomas Fröjd said the following:
>>>
>>> Hi
>>>
>>> Thank you very much for taking time to answer.
>>>
>>> The solution of using hist(data) for the main dataset and adding
>>> lines(density(refdata)) for the reference data seems to work great. I
>>> forgot to mention one thing however: I need to have frequency on the y
>>> axis instead of density as it is now.
>>>
>>> I know this is not a "real" histogram but since the audience is not
>>> very statistically experienced I would prefer to do it this way.
>>> Anyone have an idea?
>>>
>>> Thanks again for your help.
>>>
>>> Thomas Fröjd
>>>
>>> On Wed, Jun 25, 2008 at 6:16 PM, Daniel Folkinshteyn <dfolkins_at_gmail.com>
>>> wrote:
>>>>>
>>>>> I don't understand this. Why not just get hist() to plot on the
>>>>> density scale,
>>>>> thereby making its output commensurate with the output of density()?
>>>>> The hist() function will plot on the density scale if you ask it to.
>>>>> Set freq=FALSE
>>>>> (or prob=TRUE) in the call to hist.
>>>>
>>>> ehrm... because i didn't realize that option existed :) that certainly is
>>>> easier than manually scaling hist output by NOBS - thanks for the tip!
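
For reference, the density-scale approach described here can be sketched as follows. The data are simulated and the names main_data and refdata are placeholders (assumptions, not the original datasets). With freq=FALSE the bar areas sum to 1, so the output of density() can be overlaid without any rescaling.

```r
# Simulated placeholders for the main and reference datasets (assumptions)
set.seed(7)
main_data <- rnorm(300, mean = 5, sd = 0.4)
refdata   <- rnorm(2000, mean = 5, sd = 0.4)

# Histogram on the density scale; prob = TRUE is an equivalent spelling
h <- hist(main_data, freq = FALSE, main = "Histogram on the density scale")

# The reference density is already on the same scale
lines(density(refdata))

# Total area of the bars; it equals 1 on the density scale
bar_area <- sum(h$density * diff(h$breaks))
```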
>>>>
>>>
>>
>>
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Received on Tue 08 Jul 2008 - 15:45:12 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 08 Jul 2008 - 16:32:05 GMT.
