Re: [Rd] Incorrect behavior for ordering timepoints in "reshape" (PR#7669)

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Tue 08 Feb 2005 - 10:38:28 EST

davclark@nyu.edu writes:

> Full_Name: Dav Clark
> Version: 2.0.1
> OS: OS X 10.3
> Submission from: (NULL) (128.122.87.35)
>
>
> When the timepoints that reshape uses (in direction="long") are negative or
> fractional, the time label is assigned incorrectly. It is easier to give an
> example than to describe the problem abstractly:
>
> Assume you have a data.frame header with values related to peri-stimulus time
> like this:
>
> "HRF -5" "HRF -2.5" "HRF 0" "HRF 2.5" ... "HRF 10"
>
> And you give reshape a split argument of a space " ".
>
> Then the labels will be assigned strangely, based on alphabetical ordering. So
> the above list order maps to:
>
> -2.5, -5, 0, 10, ... 2.5
>
> Items under the "HRF -5" column in wide format receive a -2.5 label, items under
> "HRF 2.5" receive a label of 10, and so on.
>
> Somewhere, the time labels are being used before conversion to numbers. But,
> reshape returns an error if it is not possible to convert the timepoints to
> numeric! So obviously, more functionality could be provided, or at least the
> documentation should reflect the current shortfall.
>
> For completeness, here is a minimal example demonstrating the bug:
>
> df <- data.frame(id="S1", V1="from -2", V2="from -1")
> names(df)[2:3] <- c("vals.-2", "vals.-1")
> df
> reshape(df, direction="long", varying=2:3)

Hmm, this looks messed up even without the negatives. The guess()
function inside reshape always sorts the time labels as strings before
converting them to numeric, so you get the 1 10 11 2 3 4 5 6 7 8 9
effect. What is worse, the sorting decouples the values from the
variable names, as demonstrated by modifying your example slightly:

> reshape(df, direction="long", varying=3:2)

      id time    vals
S1.-1 S1   -1 from -1
S1.-2 S1   -2 from -2
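
The ordering effect by itself can be reproduced outside reshape. The
following is only an illustration, not code taken from reshape: the
vector "labels" is made up to resemble the time suffixes in the report,
and method = "radix" is used so that the string sort follows C-locale
collation regardless of the session locale.

```r
# Hypothetical time labels, as they might be split off the column names.
labels <- c("-5", "-2.5", "0", "2.5", "5", "10")

# Sort as strings first, convert afterwards: alphabetical order wins.
sorted_then_numeric <- as.numeric(sort(labels, method = "radix"))

# Convert first, then sort: proper numeric order.
numeric_then_sorted <- sort(as.numeric(labels))

# The same effect with plain positive integers.
int_labels <- sort(as.character(1:11), method = "radix")

print(sorted_then_numeric)
print(numeric_then_sorted)
print(int_labels)
```

Either way the sorted labels no longer line up with the original
column order, which is where the mislabelling comes from.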

I'm not at all sure I understand what reshape was supposed to do here; perhaps the sort in

    varying <- unique(nn[, 1])
    times <- sort(unique(nn[, 2]))

is a thinko? Over to Thomas, I think.
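
For what it is worth, here is a rough sketch of the behaviour I would
have expected, assuming the intent is simply to keep the times in the
order in which the columns appear. The helper guess_times() and its
"sep" argument are hypothetical names for illustration; this is not the
guess() code inside reshape, and not necessarily the right fix.

```r
# Hypothetical helper, NOT the guess() function inside reshape.
# Splits names of the form "stub<sep>time" at the first separator and
# returns the times in order of appearance, converted to numeric.
guess_times <- function(nms, sep = ".") {
  pos <- regexpr(sep, nms, fixed = TRUE)      # position of first separator
  stubs <- substr(nms, 1, pos - 1)            # part before the separator
  times <- substr(nms, pos + 1, nchar(nms))   # part after the separator
  list(varying = unique(stubs),
       times = as.numeric(unique(times)))     # no sort(): column order is kept
}

res <- guess_times(c("vals.-2", "vals.-1"))
print(res)
```

With the times left unsorted, the first time label belongs to the first
varying column, so labels and values stay coupled whatever order the
columns are given in.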

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

Received on Tue Feb 08 09:48:16 2005
