Adaikalavan Ramasamy <ramasamy@cancer.org.uk> writes:

> I think it is doing what is supposed to do but I never used read.spss,

> In R, when you use as.integer on a factor, the one with the lowest level
> gets a value of 1 and so on. The lowest level of the factor can be
> determined from the levels() function.
>
> f <- factor( c("Green", "Green", "Red", "Blue"),
>              levels=c("Red", "Blue", "Green") )
> levels(f)
> [1] "Red"   "Blue"  "Green"
> as.integer(f)
> [1] 3 3 1 2
>
> But the levels of a factor can be changed:
>
> as.integer( factor( f, levels=c("Green", "Blue", "Red" ) ) )
> [1] 1 1 3 2

Doesn't explain why 1 2 3 in the input file comes out as Green Blue Red, does it?

> You can also try setting use.value.labels=FALSE in the read.spss function
> and then creating a factor out of it.

Would be interesting to see this. I would suspect that the damage is already done at that point though.
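
For what it's worth, here is a minimal sketch of what that could look like. It assumes that read.spss(file, use.value.labels=FALSE) hands back the raw codes 1, 2, 3 intact, which is not verified anywhere in this thread. The names color_raw and label_table are hypothetical; both are typed in by hand from Joel's post rather than read from his file.

```r
# Hypothetical stand-in for out$COLOR as read with use.value.labels=FALSE
color_raw <- c(1, 2, 3, 1, 1, 1, 2, 2, 2, 3, 3)

# Label table as printed in Joel's output (values in descending order)
label_table <- c(green = 3, blue = 2, red = 1)

# Sort the table by value so that code 1 becomes the first level
ord <- order(label_table)
color <- factor(color_raw,
                levels = unname(label_table)[ord],
                labels = names(label_table)[ord])

levels(color)
as.integer(color)
```

If the codes do arrive intact, as.integer(color) should reproduce color_raw, with red as the first level.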

I notice that the value labels are in reverse order. Shouldn't matter to read.spss which has

rval[[nm]] <- factor(rval[[nm]], levels = vl[[v]], labels = trim(names(vl[[v]])))

i.e. levels and labels should be in the correct order.
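
A small sketch of why the order of the label table should not matter as long as levels and labels are passed as matching pairs. The vector vl is a hand-typed, hypothetical stand-in for vl[[v]], and trim() is left out because these names carry no padding.

```r
x <- c(1, 2, 3, 1)

# Label table in descending order, as in Joel's label.table
vl <- c(green = 3, blue = 2, red = 1)
f_desc <- factor(x, levels = unname(vl), labels = names(vl))

# The same table in ascending order
vl_asc <- rev(vl)
f_asc <- factor(x, levels = unname(vl_asc), labels = names(vl_asc))

# Each value gets the same label either way ...
as.character(f_desc)
as.character(f_asc)

# ... only the level order, and hence the integer codes, differ
levels(f_desc)
levels(f_asc)
```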

But something is odd, you'd expect the following effect:

> x <- 1:3
> factor(x,levels=3:1,labels=c("G","B","R"))
[1] R B G
Levels: G B R
> as.integer(factor(x,levels=3:1,labels=c("G","B","R")))
[1] 3 2 1

but Joel's output has the levels in the order R B G, which contradicts the
attr(,"label.table")$COLOR

BTW, this is R 2.1.1, I hope Joel isn't wasting our time by using an older version...

-p

> Regards, Adai

> On Tue, 2005-07-26 at 17:04 -0700, Joel Bremson wrote:
> > Hi,
> >
> > I'm having a problem with read.spss reversing my factor input.
> >
> > Here is the input copied from the spss data editor:
> >
> > color cost
> > 1 2.30
> > 2 2.40
> > 3 3.00
> > 1 2.10
> > 1 1.00
> > 1 2.00
> > 2 4.00
> > 2 3.20
> > 2 2.33
> > 3 2.44
> > 3 2.55
> >
> > For color, red=1, blue=2, and green = 3. Its type is 'String' and
> >
> > >out=read.spss(file)
> > >out
> >
> > $COLOR
> > [1] green blue red green green green blue blue blue red red
> > Levels: red blue green
> >
> > $COST
> > [1] 2.30 2.40 3.00 2.10 1.00 2.00 4.00 3.20 2.33 2.44 2.55
> >
> > attr(,"label.table")
> > attr(,"label.table")$COLOR
> > green  blue   red
> >     3     2     1
> > attr(,"label.table")$COST
> > NULL
> > attr(,"variable.labels")
> >   COLOR    COST
> > "color"  "cost"
> > =====EOF===================
> >
> > Notice that the $COLOR factor data are inverted. Looking at the integer
> > output we see:
> >
> > > as.integer(out$COLOR)
> > [1] 3 2 1 3 3 3 2 2 2 1 1
> >
> > The spss original data looks like this:
> > 1 2 3 1 1 1 2 2 2 3 3
> >
> > I can easily invert the output mathematically with:
> > q = sapply(m,function(x){ x + 2*(median(unique(m))-x)})
> > (m is composed of sequential integers starting at one)
> >
> > Graduate Student
> > UC Davis
> > [[alternative HTML version deleted]]
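
On the inversion formula quoted above: x + 2*(median(unique(m)) - x) simplifies to 2*median(unique(m)) - x, and for sequential integers that is just max(m) + min(m) - x. A quick check, with m typed in by hand from the as.integer(out$COLOR) output in Joel's post (q2 is a name made up for this sketch):

```r
# Integer codes as printed by as.integer(out$COLOR) in Joel's post
m <- c(3, 2, 1, 3, 3, 3, 2, 2, 2, 1, 1)

# Joel's formula, applied element by element
q <- sapply(m, function(x) x + 2 * (median(unique(m)) - x))

# Vectorised equivalent, valid when the codes are sequential integers
q2 <- max(m) + min(m) - m

q
all(q == q2)
```

Either way this only flips the codes back after the fact; it says nothing about why read.spss produced them in that order.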

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)                  FAX: (+45) 35327907

