From: Spencer Graves <spencer.graves_at_pdf.com>

Date: Mon 29 May 2006 - 06:45:23 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Mon May 29 06:52:27 2006

**PAIRWISE KOLMOGOROV-SMIRNOV:**

I don't know of an existing function, but it looks like you could just
type "pairwise.t.test" at a command prompt, copy the code into an R
script file, and create a function "pairwise.ks.test" by replacing the
call to "t.test" with one to "ks.test". Try it. If you have trouble
making it work, submit a post on that.

I would NOT do this, however, because "ks.test" assumes samples of INDEPENDENT observations. If you've got time series, I would expect the assumption of independence to be violated, and I would not believe the results of a KS test. If you want to try what I just suggested, please also try it on multiple time series generated WITHOUT "varying our representation of the stream within the model", preferably several times, to see how often the test reports differences that are not there.
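
The suggestion above can be sketched as follows. This is only a minimal sketch, not a tested tool: "pairwise.ks.test" is a hypothetical name, the structure is borrowed from what "pairwise.t.test" does internally (it builds a comparison function and hands it to "pairwise.table"), and the data are simulated INDEPENDENT samples, so the independence caveat still applies to real time series.

```r
# Hypothetical helper modeled on the structure of stats::pairwise.t.test;
# it is not part of base R.
pairwise.ks.test <- function(x, g, p.adjust.method = p.adjust.methods, ...) {
  p.adjust.method <- match.arg(p.adjust.method)
  g <- factor(g)
  # p-value of the two-sample KS test between the i-th and j-th groups
  compare.levels <- function(i, j) {
    xi <- x[as.integer(g) == i]
    xj <- x[as.integer(g) == j]
    ks.test(xi, xj, ...)$p.value
  }
  # pairwise.table() fills the lower triangle and applies p.adjust()
  pairwise.table(compare.levels, levels(g), p.adjust.method)
}

# Simulated, independent data (an assumption for illustration only):
# groups "a" and "b" share a distribution, group "c" is shifted.
set.seed(42)
x <- c(rnorm(50), rnorm(50), rnorm(50, mean = 2))
g <- rep(c("a", "b", "c"), each = 50)
pv <- pairwise.ks.test(x, g, p.adjust.method = "bonferroni")
print(pv)
```

The result is a lower-triangular matrix of adjusted p.values, one per pair of groups, in the same layout that "pairwise.t.test" reports.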

**COMPARING MULTIPLE TIME SERIES**

If I had k different time series to compare, I might proceed as
follows:

1. Make normal probability plots using, e.g., qqnorm. If the observations did NOT look normal, I'd consider some transformation. If the numbers were all positive, I might consider using the "boxcox" function in library(MASS) to help select one. However, I wouldn't completely believe the results, because this also assumes the observations are independent, and I know they're not.
2. Try to fit some traditional time series model as described, e.g., in the chapter on time series in Venables and Ripley (2002) Modern Applied Statistics with S (Springer). There are better books on time series, but this is probably the first book I would recommend to anyone using R, and this chapter would be a reasonable start. I'd play with this until I seemed to get sensible fits for nearly all series with the same model and with residuals that looked fairly though not totally (a) white by the Box-Ljung criteria, and (b) normal in normal probability plots. If I saw consistent non-normal behavior in the residuals, it would indicate a problem bigger than I can handle in a brief email like this.
3. With k different time series, most of the results of step 2 could be summarized in k sets of estimated regression coefficients, all for the same model, with estimated standard errors plus whitened residuals. If you had m parameters, each pair of time series could then be summarized into m z-scores = (b.i-b.j)/sqrt(var.b.i+var.b.j), which could then be further converted into m p.values. You would then add the p.value from ks.test, making (m+1) p.values for each of the k*(k-1)/2 = 10 pairs of series with k = 5 series. I'd then feed these (m+1)*k*(k-1)/2 p.values into "p.adjust" to get an answer. (Note: "pairwise.t.test" calls "pairwise.table", which further calls "p.adjust". I didn't know any of this before I read your post.) I might experiment with the different "methods" for p.adjust, and if I got different answers from the different methods, I might worry about which to believe. The Bonferroni is the simplest, most widely known and understood, but also perhaps the most conservative. I might tend to believe some of the others more, but if I got different answers, I'd suspect that the case was marginal, and I might want to generate other sets of simulations and try those.
4. There are other facilities in R for multiple comparisons, e.g., in the multcomp and pgirmess packages. Before I actually undertook steps 1, 2, and 3, above, I might review these packages to familiarize myself more with their contents.
5. Virginia Tech has an excellent Statistics department with a consulting center. You might try them.
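
The model-fitting and p.value steps above might be sketched roughly as follows. Everything specific here is an assumption for illustration: the k = 5 series are simulated AR(1) stand-ins for the model output, the common model is arima with order c(1, 0, 0) (so m = 2 coefficients), "ks.test" is applied to the residuals of each fit, the z-scores treat the two series in a pair as independent of each other, and "holm" is just one of the available p.adjust methods.

```r
set.seed(1)
k <- 5     # number of series
n <- 300   # observations per series

# Simulated stand-ins for the k model outputs (assumption: AR(1) processes)
series <- lapply(seq_len(k), function(i) arima.sim(list(ar = 0.6), n = n))

# Fit the SAME model to every series
fits <- lapply(series, function(y) arima(y, order = c(1, 0, 0)))

# Box-Ljung check that the residuals look roughly white
# (fitdf = 1 accounts for the one estimated AR coefficient)
lb.p <- sapply(fits, function(f)
  Box.test(residuals(f), lag = 10, type = "Ljung-Box", fitdf = 1)$p.value)

# m x k matrices of coefficient estimates and their estimated variances
b <- sapply(fits, coef)
v <- sapply(fits, function(f) diag(f$var.coef))

# All k*(k-1)/2 pairs of series, one pair per column
pairs <- combn(k, 2)

# m two-sided p.values per pair from z = (b.i - b.j)/sqrt(var.b.i + var.b.j)
p.z <- apply(pairs, 2, function(ij) {
  z <- (b[, ij[1]] - b[, ij[2]]) / sqrt(v[, ij[1]] + v[, ij[2]])
  2 * pnorm(-abs(z))
})

# One KS p.value per pair, computed on the whitened residuals
p.ks <- apply(pairs, 2, function(ij)
  ks.test(residuals(fits[[ij[1]]]), residuals(fits[[ij[2]]]))$p.value)

# (m+1) p.values for each pair, adjusted together
p.all <- rbind(p.z, ks = p.ks)
colnames(p.all) <- apply(pairs, 2, paste, collapse = "-")
p.adj <- p.all
p.adj[] <- p.adjust(p.all, method = "holm")  # [] keeps the matrix shape
print(round(p.adj, 3))
```

Rerunning the last step with other values of "method" (for example "bonferroni" or "BH") shows whether the conclusion depends on the adjustment chosen.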

Hope this helps,
Spencer Graves

Kyle Hall wrote:

> I am interested in a statistical comparison of multiple (5) time series
> generated from modeling software (Hydrologic Simulation Program Fortran). The
> model output simulates daily bacteria concentration in a stream. The multiple
> time series are a result of varying our representation of the stream within
> the model.
>
> Our main question is: Do the different methods used to represent a stream
> produce different results at a statistically significant level?
>
> We want to compare each output time series to determine if there is a
> difference before looking into the cause within the model. In a previous
> study, the Kolmogorov-Smirnov k-sample test was used to compare multiple time
> series.
>
> I am unsure about the strength of the Kolmogorov-Smirnov test and I have set
> out to determine if there are any other tests to compare multiple time
> series.
>
> I know that R has the ks.test but I am unsure how this test handles multiple
> comparisons. Is there something similar to a pairwise.t.test with a
> Bonferroni correction, only with time series data?
>
> Does R currently (v 2.3.0) have a comparison test that takes into account the
> strong serial correlation of time series data?
>
>
> Kyle Hall
>
> Graduate Research Assistant
> Biological Systems Engineering
> Virginia Tech
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Mon 29 May 2006 - 08:10:31 EST.
