I am a new user of R and am trying to assess whether the sample size is adequate for water quality data being collected at sites across a wide geographic region. A preliminary set of data has been collected, and I would like to use it to assess whether we are collecting enough data and in the right places.

A factorial approach was initially used to characterize sites by well type, latrine type, distance between well and latrine, and ecological region. Altogether the basic structure has:

3 types of wells
3 types of latrines
4 distance categories
13 regions

We define a “site-type” as a well-latrine-distance combination. There are 36 of these (3 x 3 x 4). Each of the 36 site-types is replicated between 1 and 4 times in each of the 13 regions. Some regions have more replicates than others because the region is more complex. In total there are 936 sites.
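To make the counts concrete, here is a minimal sketch of the design arithmetic. It is written in Python only as a neutral calculator (the same sums are easy to redo in R). The factor names and the list of model terms are taken from this post; the residual degrees of freedom assume every listed term is estimable and nothing else is added to the model.

```python
from math import prod

# Number of levels per factor, as described in the post.
levels = {"well": 3, "latrine": 3, "distance": 4, "region": 13}
total_sites = 936

# A site-type is one well x latrine x distance combination.
site_types = levels["well"] * levels["latrine"] * levels["distance"]

# A cell is one site-type within one region.
cells = site_types * levels["region"]
mean_reps = total_sites / cells

# Main effects and interactions of interest (assumed model terms).
terms = [
    ("well",), ("latrine",), ("distance",), ("region",),
    ("well", "distance"), ("well", "latrine"), ("well", "region"),
]

# Degrees of freedom of a term = product of (levels - 1) over its factors.
df = {":".join(t): prod(levels[f] - 1 for f in t) for t in terms}
model_df = sum(df.values())
residual_df = total_sites - 1 - model_df

for name, value in df.items():
    print(f"{name:15s} df = {value}")
print("site-types:", site_types, "cells:", cells, "mean replicates:", mean_reps)
print("model df:", model_df, "residual df:", residual_df)
```

The average of 2 replicates per cell is consistent with the stated range of 1 to 4, but the actual design is unbalanced, so per-cell counts from the real data would be needed for an exact analysis.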

At this point, I have an ANOVA model with water quality measures as the response and only these categorical variables as predictors. I want to know whether I am collecting enough samples (given alpha and beta levels) to detect main effects of well type, latrine type, distance, and region, as well as the well-distance, well-latrine, and well-region interactions. I would also like to perform a power analysis to see how power varies with sample size.

I am working my way through various texts and help functions but thought I would see if anyone else has learned how to do this already.

I would appreciate any and all guidance.

Best wishes, Chris

