From: ONKELINX, Thierry
Date: Mon 25 Sep 2006 - 12:11:44 GMT

Your problem would be a lot easier if you coded the location in one variable instead of three variables. Then you could calculate the means with one line of code:

by(results$q1, results$location, mean)
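If you want to build such a variable from the three indicator columns, something along these lines might work. It is only a sketch: the data frame below is just the three example rows from your message, the names cities and means are made up for illustration, and it assumes every row has exactly one 1 among the city columns.

```r
# Example data copied from the question (three indicator columns plus q1)
results <- data.frame(
  London = c(0, 0, 1),
  Rome   = c(0, 1, 0),
  Vienna = c(1, 0, 0),
  q1     = c(4, 2, 3)
)

# Hypothetical helper vector holding the indicator column names
cities <- c("London", "Rome", "Vienna")

# For each row, find the column that holds the 1 and use its name
# (assumes exactly one indicator per row is set)
idx <- max.col(as.matrix(results[, cities]), ties.method = "first")
results$location <- factor(cities[idx])

# Mean of q1 per location, printed group by group
by(results$q1, results$location, mean)

# Same means as a named vector, which is easier to reuse
means <- tapply(results$q1, results$location, mean)
means
```

tapply() returns the same numbers as by() but in a named vector, which may be handier if you need the means later on.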

With your dataset as it is, you could use

by(results$q1, results$London, mean)
by(results$q1, results$Rome, mean)
by(results$q1, results$Vienna, mean)

In each result, the mean reported for group 1 is the mean of q1 for that city.

See ?by for more information.

And take a good look at your code. You take a subset of results and then assign it to results. This means that you replace the original results dataframe with a subset of it. When you take the subset for the next city, you are no longer subsetting the original dataset but the previous subset!
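One way to avoid overwriting results is to leave it alone and either index q1 directly or store each subset under its own name. This is a minimal sketch, again using only the three example rows from your message; the names mean_london, mean_rome, mean_vienna and london are made up for illustration.

```r
# Example data copied from the question
results <- data.frame(
  London = c(0, 0, 1),
  Rome   = c(0, 1, 0),
  Vienna = c(1, 0, 0),
  q1     = c(4, 2, 3)
)

# Mean over all locations; 'results' itself is never modified
mean_all <- mean(results$q1)

# Mean per city: pick the q1 values of the rows where the indicator is 1
mean_london <- mean(results$q1[results$London == 1])
mean_rome   <- mean(results$q1[results$Rome == 1])
mean_vienna <- mean(results$q1[results$Vienna == 1])

# Alternative: keep each subset in a separate object
london <- subset(results, London == 1)
mean(london$q1)
```

Because nothing is assigned back to results, every line works on the full dataset, whatever order you run them in.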



Hello all,

I hope I chose the right list, as my question is a beginner question.

I have a data set with 3 columns, "London", "Rome" and "Vienna". The location is indicated by a 1, like this:

London  Rome  Vienna  q1
0       0     1       4
0       1     0       2
1       0     0       3


I just want to calculate the means of the variable q1.

I tried the following script:

# calculate the mean of all locations
results <- subset(results, subset== 1 )
# calculate the mean of London
results <- subset(results, subset== 1 , select=c(London))
mean(results$q1)
# calculate the mean of Rome
results <- subset(results, subset== 1 , select=c(Rome))
mean(results$q1)
# calculate the mean of Vienna
results <- subset(results, subset== 1 , select=c(Vienna))
mean(results$q1)

As all results are 1.68 and there is definitely a difference between the three locations, I wonder what's going on. I get confused because Rcmdr asks me to overwrite things and there is no "just filter" option.

Any help would be appreciated. Thank you in advance.


