Re: [R] How do I normalize a PSD?

From: Keith Chamberlain <Keith.Chamberlain_at_colorado.edu>
Date: Thu 02 Feb 2006 - 15:14:50 EST


Dear Tom,

Short answer, if your using spec.pgram(), use the smoothing kernel to get a better estimate at the frequency centered in the bandwidth. If your frequency bin of interest is wider than the bandwidth of the kernel, average across frequencies (I think). The estimate appears to be normalized already.

If you are calculating your PSD independently, then oversample (e.g. 2, perhaps 4 or more times the highest frequency you need to characterize, and sum (not average) across the frequency bin to get the power estimate of the center frequency in that bin. Shove the whole signal into memory and don't window it if you can. The oversampling improves your estimate and reduces the variance. Getting the FFT of the whole signal at once eliminates edge effects that would occur if you segmented the data. Whatever your PSD is normalized to, the sum across frequencies in your bin will maintain that property.

If you are really wanting to see the PSD plotted by period rather than frequency, plot the inverse of your frequency values.

The normalization part is a separate (and huge, perhaps neglected, perhaps overrated) subject. Here is how I have tried to make sense out of it.

I do not have years and years of experience in spectral analysis, but perhaps you may find this a more simple (minded?) description. I was taught the definition of the power spectrum using "Numerical Recipes" from Press, et al (e.g. any of 1988, 1992, and 2002, for FORTRAN, C, and C++ respectively). Their description seems different (as Leif suggested) from other sources. They also happen to caution (strongly) about the distinction between "one-sided" and "two-sided" PSDs, but I'm not sure to what extent other authors have concerned themselves with that point, or that it is even necessary in most practical applications (warning: non-expert opinion).

In the definition I learned, a periodogram is considered "normalized" in the sense that the sum across the frequency range has something to do with some quantity (Sum Squared, Mean Squared, etc.) of a signal in the time domain.

Let
N = Number of samples
Fc = Nyquist critical frequency
C = Normalization parameter

P2(f) = C * Mod(fft(signal))^2 from -fc to fc: Press et al call a "two-sided" estimate, even if just the power at positive frequencies is returned.

P1(f) = C * Mod(fft(signal))^2 from DC to fc, but all frequencies excluding DC and Nyquist are multiplied by 2. Press et al. consider this the "one-sided" estimate.

The reason they call those "one" or "two" sided is that the sum of the power over all of the frequencies will be equivalent even though the one sided estimate has only half the data points.
Sum(P2) == sum(P1)

The parameter C is what I think I understand as the normalization parameter, and besides the one/two sided nomenclature issue, is where authors seem to differ. Regardless of which definition of the PSD is used, where C = 1/N, the periodogram (to Press, et al.) would be normed to the Sum Squared of the signal in the time domain. I believe some authors might call that form un-normalized, depending on whether or not 1/N was included in the definition of a periodogram that they happen to have been working from. Where C = 1/N^2, the periodogram would be normed to the Mean Squared of the signal in the time-domain. Where C = some other factor, the periodogram is normed to the equivalent of that factor from the time-domain. That's my (perhaps oversimplified) understanding.

>From Venables & Ripley, listed as a source on >?spec.pgram, the definition
used for that function seems comparable - though I wouldn't count out the possibility that I missed or misread something. The factor C in that case is 1/N so the PSD estimate should sum to the Sum Squared in the time domain. There are 2 caveats. The first is a point that Leif pointed out, that the periodogram from spec.pgram() is normalized to frequency. I need to reread that section in Venables & Ripley and look at the link provided in another recent reply in more detail to be certain. For now, I am making the assumption that the parameter would be used 'in addition' to the 1/N factor. I highly recommend that you reference the sources from the spec.pgram help page.

Second, smoothing and tapering are defined separately from the PSD, so I do not think the equalities hold up as well between a quantity of the (unsmoothed) time-domain signal, and its normalized (smoothed &/or tapered) power spectrum.

I hope this helps, wasn't to long, and that my weak editing skills did not make to grand an appearance.

Regards,
KeithC.

Message: 30
Date: Tue, 31 Jan 2006 13:02:03 -0800
From: Leif Kirschenbaum <leif@reflectivity.com> Subject: Re: [R] How do I "normalise" a power spectral density To: r-help@stat.math.ethz.ch, Tom C Cameron <bgytcc@leeds.ac.uk> Message-ID: <200601312102.k0VL29fS003509@hypatia.math.ethz.ch> Content-Type: text/plain; charset=iso-8859-1

I have done a fair bit of spectral analysis, and hadn't finished collecting my thoughts for a reply, so hadn't replied yet.

What exactly do you mean by normalize?
  I have not used the functons periodogram or spectrum, however from the description for periodogram it appears that it returns the spectral density, which is already normalized by frequency, so you don't have to worry about changing the appearance of your periodogram or power spectrum if you change the time intervals of your data. With a normal Fourier Transform, not only do you need to complex square the terms but you also need to divide by a normalizing factor to give power-per-frequency-bin, which the periodogram appears to do.

  However, if you look in various textbooks, the definition of the Fourier Transform (FT) varies from author to author in the magnitude of the prepended scaling factor. Since the periodogram is related to the FT (periodogram ultimately uses the function mvfft), without examination of the code for periodogram you cannot know the scaling factor, which is almost always one of the following:
  1.0
  1/2
  1/(2*pi)
  SQRT(1/2)
  SQRT(1/(2*pi))
  In fact, if you obtain an FT (or FFT or DFT) from a piece of electronics (say an electronic spectrum analyzer), the prepending factor can vary from manufacturer to manufacturer.
  Fortunately, there is a strict relationship between the variance of your signal and the integrated spectral density. If your time signal is x(t), the spectral density is S(f), and fc = frequency(x)/2 the Nyquist cutoff frequency, then this may be expressed as:

  variance(x(t)) = constant * {Integral from 0 to fc of S(f)}

In R-code: let x be your time series, and "constant" be the unknown scaling factor (1/2, 1/2pi, etc.)
  p <- periodogram(x)
  var(x) == constant * sum(p[[1]])/length(p[[1]])

Or:
  constant = sum(p[[1]])/length(p[[1]])/var(x)

and we find that the appropriate scaling constant is 1.0.

  As regards plotting versus period, the periodogram returns arrays of spectral amplitude and frequencies. The frequencies are in inverse units of the intervals of your time series. i.e. if your time series is 1-point per day, then the frequencies are in 1/day units. Thus if you wish to plot amplitude versus period in weeks you have a little math to do.   I believe that plotting is usually versus frequency since most observers are interested in how things vary versus frequency: multiple evenly spaced peaks on a linear frequency scale indicates the presence of harmonics; this is not so simply seen in a plot versus period. ex. peaks of 10 Hz, 20 Hz, 30 Hz, 40 Hz,... in period would be at periods of 100 ms, 50 ms, 25 ms, 12.5 ms, and the peaks are not evenly spaced.   Additionally, there are all kinds of typical responses versus frequency (1/f, 1/f^2) which are clearly seen in plots versus frequency as straight lines (log power vs log frequency), but would come out as curves in plots versus period.
  I can see how ecological studies may indeed be more interested in the periods.

  However, I would be wary of using the periodogram function, for if I calculate periodograms of the same sinewave but for differing lengths of the sample, the spectral density does not come out the same. All 4 of the plots produced by the code below should overlay, and yet as the time series becomes longer there appears to be an increasing offset of the magnitudes returned. (black-brown-blue-red)

x0<-ts(data=sin(2*pi*1.1*(1:1000)/10),frequency=10) p0<-periodogram(x0)
var(x0)

x1<-ts(data=sin(2*pi*1.1*(1:10000)/10),frequency=10) p1<-periodogram(x1)
var(x1)

x2<-ts(data=sin(2*pi*1.1*(1:100000)/10),frequency=10) p2<-periodogram(x2)
var(x2)

x3<-ts(data=sin(2*pi*1.1*(1:1000000)/10),frequency=10) p3<-periodogram(x3)
var(x3)

plot(p3[[2]],p3[[1]],col="red",type="p",pch=19,cex=0.05,log="y",

        xlim=c(0.1,0.12),ylim=c(1e-30,1e-15))

points(p2[[2]],p2[[1]],type="l",col="blue")
points(p1[[2]],p1[[1]],type="l",col="brown")
points(p0[[2]],p0[[1]],type="o",col="black")


> Message: 113
> Date: Mon, 30 Jan 2006 17:45:58 -0800
> From: Spencer Graves <spencer.graves@pdf.com>
> Subject: Re: [R] How do I "normalise" a power spectral density
> analysis?
> To: Tom C Cameron <bgytcc@leeds.ac.uk>
> Cc: r-help@stat.math.ethz.ch
> Message-ID: <43DEC156.2090900@pdf.com>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
> Since I have not seen a reply to this post, I will
> offer a comment,
> even though I have not used spectral analysis myself and
> therefore have
> you intuition about it. First, from the definitions I read in the
> results from, e.g., RSiteSearch("time series power spectral density")
> [e.g.,
> http://finzi.psych.upenn.edu/R/library/GeneTS/html/periodogram
> .html] and
> "spectral analysis" in Venables and Ripley (2002) Modern Applied
> Statistics with S (Springer), I see no reason why you
> couldn't plot the
> spectrum vs. the period rather than the frequency. Someone else may
> help us understand why it is usually plotted vs. the frequency; I'd
> guess that the standard plot looks more like the integrand in the
> standard Fourier inversion formula, but I'm not sure.
>
> If you'd like more help from this listserve, you
> might briefly
> describe the problem you are trying to solve, why you think spectral
> analysis analysis should help, and include a toy example with some
> self-contained R code to illustrate what you tried and what you don't
> understand about it. (And PLEASE do read the posting guide!
> "www.R-project.org/posting-guide.html". Nothing is certain but
> following that posting guide will, I believe, tend to
> increase the speed
> and utility of response.)
>
> hope this helps.
> spencer graves
>
> Tom C Cameron wrote:
>
> > Hi everyone
> >
> > Can anyone tell me how I normalise a power spectral density
> (PSD) plot of a
> > periodical time-series. At present I get the graphical
> output of spectrum VS
> > frequency.
> >
> > What I want to acheive is period VS spectrum? Are these the
> same things but the
> > x-axis scale needs transformed ?
> >
> > Any help would be greatly appreciated
> >
> > Tom
> >
> ..............................................................
> .............
> > Dr Tom C Cameron office: 0113 34
> 32837 (10.23 Miall)
> > Ecology & Evolution Res. Group. lab: 0113 34 32884
> (10.20 Miall)
> > School of Biological Sciences Mobile: 07966160266
> > University of Leeds email:
> t.c.cameron@leeds.ac.uk
> > Leeds LS2 9JT
> > LS2 9JT
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Feb 02 15:22:31 2006

This archive was generated by hypermail 2.1.8 : Thu 02 Feb 2006 - 20:22:26 EST