[R] type conversion with apply or not

From: Horace Tso <Horace.Tso_at_pgn.com>
Date: Tue, 08 Jun 2010 14:19:07 -0700


Folks, I thought it should be straightforward, but after a few hours of poking around I decided it's best to post my question on this list.

I have a data frame consisting of a (large) number of date columns, which are read in from a csv file as character strings. I want to convert them to Date type. The following is an example, where the first column is of integer type, while the rest are of type character.

> head(df)
  TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
1  250897         7/1/2010      7/31/2010  PSTRUCT    Z
2  250617         8/1/2010      8/31/2010  PSTRUCT    Z
3  250364         4/1/2011      6/30/2011      PLR    Z
4  250176         4/1/2011      6/30/2011      PLR    Z
5  250176         4/1/2011      6/30/2011      PLR    Z
6  250364         4/1/2011      6/30/2011      PLR    Z

> sapply(df, class)
         TRANSNO TRANS.START_DATE   TRANS.END_DATE         DIVISION             FASB
       "integer"      "character"      "character"      "character"      "character"

I thought it was just a matter of using apply with as.Date,

df[,c(2,3)] = apply(df[,c(2,3)], 2, function(x)as.Date(x,"%m/%d/%Y"))

Well, the Date conversion failed and I got:

  TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
1  250897            14791          14821  PSTRUCT    Z
2  250617            14822          14852  PSTRUCT    Z
3  250364            15065          15155      PLR    Z
4  250176            15065          15155      PLR    Z
5  250176            15065          15155      PLR    Z
6  250364            15065          15155      PLR    Z

The character columns are indeed converted, but they became plain numbers (day counts since 1970-01-01), not Date type. OK, that's strange, and so I started reading the help pages.

It turns out that in apply the result is coerced to one of the basic vector types, and apparently Date is not one of them. From ?apply:

"In all cases the result is coerced by as.vector to one of the basic vector types before the dimensions are set, so that (for example) factor results will be coerced to a character array."
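
To make the coercion concrete, here is a minimal sketch on a small stand-in data frame (three rows copied from the example above; the object name res is only for illustration). It shows that apply hands back a plain numeric matrix, and that the numbers are day counts which as.Date can turn back into dates when given an origin.

```r
# Small stand-in for the data frame above (values copied from the example rows)
df <- data.frame(
  TRANSNO          = c(250897L, 250617L, 250364L),
  TRANS.START_DATE = c("7/1/2010", "8/1/2010", "4/1/2011"),
  TRANS.END_DATE   = c("7/31/2010", "8/31/2010", "6/30/2011"),
  stringsAsFactors = FALSE
)

# apply() converts its input to a matrix and collects the results into a
# matrix as well, so the "Date" class attribute is lost along the way
res <- apply(df[, c(2, 3)], 2, function(x) as.Date(x, format = "%m/%d/%Y"))

is.matrix(res)        # the result is a matrix ...
inherits(res, "Date") # ... and no longer carries the Date class
res[1, 1]             # number of days since 1970-01-01

# The day counts can be converted back to dates by supplying the origin
as.Date(res[, "TRANS.START_DATE"], origin = "1970-01-01")
```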

The question then is how type conversion can be carried out on some columns of a data frame without using a loop.
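
One possible approach, sketched here on the same small stand-in data frame rather than the real file, is to use lapply over the selected columns and assign the resulting list back. A data frame is a list of columns, so each Date vector is stored as-is instead of being flattened into a matrix. The name date_cols is an assumption made for this illustration.

```r
# Small stand-in for the data frame above (values copied from the example rows)
df <- data.frame(
  TRANSNO          = c(250897L, 250617L, 250364L),
  TRANS.START_DATE = c("7/1/2010", "8/1/2010", "4/1/2011"),
  TRANS.END_DATE   = c("7/31/2010", "8/31/2010", "6/30/2011"),
  stringsAsFactors = FALSE
)

# Columns to convert (hypothetical helper name; column indices would also work)
date_cols <- c("TRANS.START_DATE", "TRANS.END_DATE")

# lapply() returns a list with one Date vector per column; assigning that list
# back into df[date_cols] keeps the Date class of every element
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%m/%d/%Y")

sapply(df, class)  # the two date columns should now report "Date"
```

With many date columns, date_cols could instead be built from a pattern in the column names, for example with grep("DATE", names(df)), assuming the names follow such a convention.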

Thanks.

Horace Tso


R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 08 Jun 2010 - 21:22:13 GMT