Re: [Rd] unexpected result from reshape

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Sat, 24 Nov 2007 18:57:50 +0100

Antonio, Fabio Di Narzo wrote:
> Hi all.
> I get unexpected reshape results on datasets with certain variable
> names. Here is a reproducible example:
>
> d <- matrix(seq_len(7*7), 1, 7*7)
> vnames <- c('acc','ppeGross','CF','ROA','DeltaSales','invTA','DeltaRevDeltaRec')
> varying <- unlist(lapply(vnames, paste, 1:7, sep='.'))
> d <- data.frame(d)
> names(d) <- varying
> d1 <- reshape(d, varying=varying, direction="long")
> d[,'ppeGross.2'] == d1[d1$time==2,'ppeGross'] #This is FALSE!
> ##Try to compare d and d1: values are wrong from the 2nd column
>
> ##Changing variable names makes things go right:
> vnames <- letters[1:7]
> varying <- unlist(lapply(vnames, paste, 1:7, sep='.'))
> names(d) <- varying
> d1 <- reshape(d, varying=varying, direction="long")
> d[,'b.2'] == d1[d1$time==2,'b'] #This is TRUE, as expected
> ##Try to compare d and d1 now: they look right
>
> Any hint on what's wrong here? For now, my workaround is changing
> variable names before reshaping, then re-assigning the old variable
> names after reshape.
>
> Best regards,
> Antonio, Fabio Di Narzo.
>
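Ouch. This was dumb (*): The problem is that the guess() function uses split(nms, nn[,1]), which implicitly runs factor(nn[,1]) and so returns the groups in the order of sort(unique(nn[,1])), whereas later on we just use unique(nn[,1]), i.e. the order of first appearance. The two orders agree only when the variable names happen to be sorted already.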
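
The mismatch can be reproduced outside reshape() with a small sketch. This is not the actual reshape() source: nms and stem are made-up stand-ins for the column names and for nn[,1], and the last line shows only one possible way to keep the two orders aligned, not necessarily the change that went into 2.6.1.

```r
# Hypothetical stand-ins for what guess() works with inside reshape()
nms  <- c("acc.1", "ppeGross.1", "CF.1", "acc.2", "ppeGross.2", "CF.2")
stem <- sub("\\.[0-9]+$", "", nms)   # plays the role of nn[,1]

# split() coerces its grouping argument with factor(), so the groups
# come back in sorted order of the stems
groups <- split(nms, stem)
names(groups)

# unique() keeps the order of first appearance instead
unique(stem)

# The two orders differ here, so labels taken from unique(stem) would be
# attached to the wrong groups
identical(names(groups), unique(stem))

# One way to keep them aligned (an assumption, for illustration only):
# fix the factor levels to the order of first appearance
groups_ok <- split(nms, factor(stem, levels = unique(stem)))
identical(names(groups_ok), unique(stem))
```

With names such as letters[1:7] the sorted order and the order of first appearance coincide, which is consistent with the second example in the original report giving the right answer.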

Fortunately, this is wrong enough and trivial enough to fix, that it can make it into 2.6.1.

    -pd

(*) I think I wrote it, so I can say so.
>
>> R.version
>>
> _
> platform i686-pc-linux-gnu
> arch i686
> os linux-gnu
> system i686, linux-gnu
> status
> major 2
> minor 6.0
> year 2007
> month 10
> day 03
> svn rev 43063
> language R
> version.string R version 2.6.0 (2007-10-03)

>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)                  FAX: (+45) 35327907

Received on Sat 24 Nov 2007 - 18:02:49 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 24 Nov 2007 - 22:30:30 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.