[R] LME & data with complicated random & correlational structures

From: Keith Chamberlain <Keith.Chamberlain_at_colorado.edu>
Date: Fri 02 Dec 2005 - 08:44:48 EST

Dear List,

This is my first post, and I'm a relatively new R user trying to work out a mixed effects model using lme() with random effects, and a correlation structure, and have looked over the archives, & R help on lme, corClasses, & etc extensively for clues. My programming experience is minimal (1 semester of C). My mentor, who has much more programming experience, but a comparable level of knowledge of R, has been stumped. What follows are 3 questions pertaining to an lme() model, one on the nested hierarcy, 1 on a strategy for a piecewise approach to the variance given I have ~24 hours of data (sampled at 32Hz, 1hr per subject), and one on the corStruct or how to get rid of serial dependencies before lme().

I'm analyzing skin temperature continuously recorded at 32Hz in Baseline (10 min), Testing (~5 min), and Recovery (20 min) epochs of a face recognition experiment. Stimuli are the same in Baseline and Recovery (portrait or landscape), and in testing, participants were tested on their recognition of a list b&w portraits presented just before testing started. On some of the portraits 'learned' the eyes were masked, and in others, the eyes were visible. In testing, the portraits have no masking but the stimuli in testing are labeled "Eyes" and "NoEyes". The data structure looks as follows:

There are 8 subjects

9 epochs - 6 of which were just "instruction" blocks, and one "Learning" period. Wrt lme(), I figured out how to use subset too isolate just the Baseline, Learning, and Testing Epochs (and avoid epochs with only 1 stimulus level, such as "instruction"). Data within each epoch are balanced wrt # trials, but not between epochs. Recovery has twice as many trials as Baseline, and Testing has about half. Time for each epoch is roughly that ratio too, although time in each trial differs.

Stimuli are the same in Baseline & Recovery, but different in Testing, although there are 2 levels in each used epoch.

Time & Temperature make up the time series, and Temperature is the dependent variable too stimulus.

1- are fixed effects and random effects discrete? That is, if I set something in my model formula as a fixed effect, then it does not make sense to set it as a random effect as well? The documentation (and posts) were not really clear on that point (not that the documentation technically 'should' be per say, just that I got confused).

The nested hierarchy for what actually gets analyzed looks as follows: Subj/Epoch/Stimulus/Temperature

Reasoning: there are several temperature samples recorded in each trial of Stimulus. Several stimuli in each Epoch, and all of the Epochs for one subject. Subject is random (theoretically) because of sampling in a population, Epoch would be fixed because all participants went through the same sequence of Epochs, but Stimulus varied randomly within an Epoch, which seems inconsistent when I apply it to the lme model as both a fixed and random effect.

Temperature ~ Stimulus-1, random=Subj|Subj/Epoch/Stimulus Subset= Epoch=="Baseline" | Epoch=="Testing" | Epoch=="Recovery"

I'm looking to correctly allocate error terms for between subjects (Subj) variability, and further delineate the within subject error between Epoch and Stimulus. The current model that I got to work (memory issues namely) is Temperature ~ Stimulus-1, random=Subj|Subj, which I decided to use to get the residuals to have the Subject variability accounted for and subtracted. Would a list of random structures work better? If so, is each item in the list structured just as the random formula? I haven't actually seen/found any examples of a list of random/nesting structures.

2- Is it possible to take a piecewise approach wrt the variance using lme(), such as modeling the variability of each subject first, then using further-nested terms in a model and the residuals from the previous? If so, what caveats exist for interpreting the variances?

I'm not interpreting p-values at this point because of another issue. When I try to set up the correlation structure, I run out of memory fast. I've tried this on a mac G5, an HP Pavilion dv1000 (= Pentium 2.6GHz), and a Gateway with an AMD athalon 900MHz processors. Each system has 386M memory or more, one of which has 1G.

3- Is there a way to get rid of the serial dependency BEFORE running the model with LME(), such as initiating a corStruct before placing it in the model? I'm working with so much data that I'm fine with doing the process piecewise. An AR process was difficult because the residuals are not the same length as the data file that I started with. Serial dependencies still gota go, whether via the correlation term in lme() or some other method, because I'll soon be breaking up the variance into components via spectrum().

So I might as well add a 4th. What's the term that gets me too data after AR() has done it's work? I'm thinking that resid() wasn't right but data that the data differ from their original length prior to an AR process may be how its done.

Psych Undergrad, CU Boulder
RE McNair Scholar

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri Dec 02 08:47:58 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:41:25 EST