Re: [R] How should I organize data to compare differences in matched pairs?

From: Greg Snow <Greg.Snow_at_imail.org>
Date: Thu, 24 Jan 2008 12:05:41 -0700

Here is how I would do it (there are multiple ways you could do it, so there is no single "right" answer):

Assign each person a unique identifier.

Put all the information from the questionnaire, along with the identifier and anything else that does not change between rounds (age, sex, height, ...), into one data frame. This data frame will have as many rows as you have subjects.

The round information then goes into a second data frame, with one row per subject per round (so each subject has multiple rows) and the unique identifier included on every row.

If you need information combined from both data frames, then use the merge function to merge the 2 data frames (or subsets of them) together.
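As a rough sketch of the layout described above (the column names id, age, sex, q1, round and score, and all of the values, are made up for illustration and are not from the original poster's data), the two data frames and the merge might look something like this:

```r
# Hypothetical example data; names and values are assumptions for illustration.

# One row per subject: identifier plus everything that does not change
# between rounds (demographics, questionnaire answers).
subjects <- data.frame(
  id  = c(1, 2, 3),
  age = c(23, 31, 27),
  sex = factor(c("F", "M", "F")),
  q1  = c(4, 2, 5)            # a made-up questionnaire item
)

# One row per subject per round, with the identifier repeated on each row.
rounds <- data.frame(
  id    = rep(c(1, 2, 3), each = 2),
  round = rep(c(1, 2), times = 3),
  score = c(10.2, 11.5, 9.8, 9.1, 12.0, 13.4)
)

# Combine the two when an analysis needs columns from both.
# merge() matches rows on the shared "id" column.
combined <- merge(rounds, subjects, by = "id")
print(combined)
```

Here merge() repeats each subject's fixed information on every one of that subject's round rows, but only in the merged result; the stored data frames still hold each value once.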

Advantages of this method include:

Uses data frames, which most of the analysis functions expect.

Each piece of data is only entered once (other than the id).

Disadvantage:

Data is split between 2 objects.
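For the matched-pairs comparison itself, one possible approach (a sketch only, again using a made-up rounds data frame with hypothetical columns id, round and score) is to merge the round 1 rows with the round 2 rows by id, so that each subject's two measurements end up side by side:

```r
# Hypothetical per-round data; names and values are assumptions for illustration.
rounds <- data.frame(
  id    = rep(c(1, 2, 3), each = 2),
  round = rep(c(1, 2), times = 3),
  score = c(10.2, 11.5, 9.8, 9.1, 12.0, 13.4)
)

# Split by round, then merge on id so each subject's pair shares a row.
r1 <- rounds[rounds$round == 1, ]
r2 <- rounds[rounds$round == 2, ]
paired <- merge(r1, r2, by = "id", suffixes = c(".r1", ".r2"))

# Within-subject difference (round 2 minus round 1).
paired$diff <- paired$score.r2 - paired$score.r1
print(paired[, c("id", "score.r1", "score.r2", "diff")])

# A paired t-test on the same two columns.
result <- t.test(paired$score.r2, paired$score.r1, paired = TRUE)
print(result)
```

Merging by id, rather than relying on row order, keeps the pairs matched even if a subject is missing a round; such subjects simply drop out of the merged result.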

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow_at_imail.org
(801) 408-8111
 
 


> -----Original Message-----
> From: r-help-bounces_at_r-project.org
> [mailto:r-help-bounces_at_r-project.org] On Behalf Of Thomas Levine
> Sent: Thursday, January 24, 2008 11:43 AM
> To: r-help_at_r-project.org
> Subject: [R] How should I organize data to compare
> differences in matched pairs?
>
> I'm just learning how to use R right now, so I'm not sure
> what the most efficient way to organize these data is.
>
> I had subjects perform the same task twice with slight
> changes between the rounds. I want to analyze differences
> between the rounds. All of the subjects also answered a questionnaire.
>
> Putting all of one subject's information on one row seems sloppy.
>
> I was thinking about making a three-dimensional array with
> subject number, round and measurement as axes, but then the
> differences would have to be the third column in the round
> axis, which also seemed messy. Also, I would have duplicates
> of all of the information from the questionnaire, which seems
> inefficient.
>
> Or maybe I could just use a matrix where round is just
> another column among all of the measurements. This is similar
> to the previous arrangement, but I don't know which is
> better. It still has all of the duplicated information that
> the previous method has.
>
> Anyway, I'm sure someone's done this before, so I'd like to
> see what other people have done for data like these.
>
> Thomas Levine
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Received on Thu 24 Jan 2008 - 19:12:58 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 25 Jan 2008 - 01:30:08 GMT.
