# [R] A model for possibly periodic data with varying amplitude [repost, much edited]

From: Andrew Robinson <A.Robinson_at_ms.unimelb.edu.au>
Date: Wed 16 Aug 2006 - 07:39:57 EST

Hi dear R community,

Sadly, an NDA prohibits me from sharing the real data. The data look something like this:

```##################################################################

patient <- data.frame(
day = c(1, 3, 5, 8, 10, 12, 15, 17, 19, 22, 24, 26),
protein = c(5, 3, 10, 7, 2, 8, 25, 22, 7, 10, 12, 5)
)

```

plot(patient\$day, patient\$protein, type="b")

# This is my model:

wave.form <-
deriv3( ~ sin(2*pi*((day-offset)/period + 0.25)) * amplitude + mean,

```         c("period", "offset", "amplitude", "mean"),
function(day, period, offset, amplitude, mean){})

curve(wave.form(x, period=7, offset=2, mean=5, amplitude=4),
from=1, to=30)

```

require(nlme)

wave.1 <- gnls(protein ~ wave.form(day, period, offset, amplitude, mean),

```               start=list(period=7, offset=0, amplitude=10, mean=5),
weights=varPower(), data=patient)

```

wave.1

```#################################################################

```

I emphasize that I don't care about the mean or amplitude, just the offset and the period. The problem for my model is that spikes in the data, such as at day 15, seem to make fitting the model quite unstable, (although it is fine in this artificial case). The best I can think of right now is to use the "weights=varPower()" model in gnls(), which I expect will down-weight the extreme spikes. Often that model fails to converge. Then, all I can do is fall back on modelling the log response. If I do both, then when the weights model converges the estimates sometimes differ, and sometimes considerably.

I know that I need more data, but more data are not available. So, instead I need to do the best I can :)

I have a half-baked idea of conditionally shrinking the peaks, so that they all look like the same size - perhaps by discretizing the data into, say, three possible values defined by quantiles of the response variable or the estimated slope - but I'm concerned that doing so risks creating the periodicity that I want to detect. Also the series is not necessarily stationary, so I wouild have to smooth it somehow first.

On the other hand, maybe I can test for periodicity in the original data, and then fit on some transformed data. If I reject the null hypothesis of no periodic oscillation, then is it reasonable to condition subsequent analysis on the belief that periodicity exists, and try to determine its characteristics?

I wonder if anyone can suggest an analysis technique or model that more directly accommodates the varying heights of the peaks? Or is this pretty much the best I can do?

Any suggestions will be gratefully received.

Cheers,

Andrew

```--
Andrew Robinson
Department of Mathematics and Statistics            Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
Email: a.robinson_at_ms.unimelb.edu.au         http://www.ms.unimelb.edu.au

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help