Re: [R] CSV file and date. Dates are read as factors!

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Fri 29 Jul 2005 - 01:15:36 EST

Don MacQueen <macq@llnl.gov> writes:

> It's really pretty simple.
>
> First, if you supply as.is=TRUE to read.csv() [or read.table()] then
> your dates will be read as character strings, not factors. That saves
> the step of converting them from factor to character.
>
> Then, use as.Date() to convert the date columns to objects of class
> "Date". You will have to specify the format, if your dates are not in
> the default format.
>
> > tmp <- as.Date('2002-5-1')
> > as.Date(Sys.time())-tmp
> Time difference of 1184 days
>
> If your dates include times, then use as.POSIXct() instead of as.Date().
>
> > tmp <- as.POSIXct('2002-5-1 13:21')
> > Sys.time()-tmp
> Time difference of 1183.746 days
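
As a rough sketch of the date-time case: subtracting POSIXct values lets R choose the units, while difftime() fixes them. The timestamps, the m/d/Y layout and the UTC time zone below are made up for illustration.

```r
# Hypothetical visit times; the m/d/Y layout and UTC time zone are assumptions.
fmt  <- "%m/%d/%Y %H:%M"
pre  <- as.POSIXct("7/27/2005 21:28", format = fmt, tz = "UTC")
post <- as.POSIXct("7/29/2005 09:28", format = fmt, tz = "UTC")

# Plain subtraction picks its own units; difftime() lets you state them.
followup_hours <- difftime(post, pre, units = "hours")
followup_days  <- difftime(post, pre, units = "days")
print(followup_hours)
print(followup_days)
```
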
>
> If you don't want to use as.is, perhaps because you have other
> columns that you *want* to have as factors, then either supply
> colClasses to read.csv, or else just use format() to convert the
> factors to character.
>
> as.Date(format(your_date_column))
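
A minimal sketch of the colClasses route, assuming a small made-up file in which the dates are in m/d/Y layout. The column names preDate and postDate come from the original question; the group column and all the data values are hypothetical.

```r
# Hypothetical CSV content standing in for the Excel export.
csv_text <- "group,preDate,postDate
A,7/27/2005,8/15/2005
B,1/3/2004,3/1/2004"

# Keep 'group' as a factor but read both date columns as plain character.
dat <- read.csv(text = csv_text,
                colClasses = c("factor", "character", "character"))

# The dates are not in the default yyyy-mm-dd layout, so give the format.
dat$preDate  <- as.Date(dat$preDate,  format = "%m/%d/%Y")
dat$postDate <- as.Date(dat$postDate, format = "%m/%d/%Y")

# Follow-up time as a difftime in days.
dat$Followup <- dat$postDate - dat$preDate
print(dat)
```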

Actually, from R 2.1.1 onwards you can forget about the as.is stuff, since as.Date() works happily with factors:

> as.Date.factor

function (x, ...)
as.Date(as.character(x), ...)

(Previous versions forgot to pass on the ... arguments, so as.Date() on a factor only worked there if the dates were in the standard format.) I suspect that as.character() is preferable to format() for the conversion, since format() pads the strings to a common width and that could cause trouble.
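
A small sketch of both points, using made-up d/m/Y dates: the format argument is passed through when as.Date() is given a factor, and format() pads the strings where as.character() does not. Whether the padding actually upsets the date parsing is not shown here, only that it is present.

```r
# Hypothetical date strings of unequal length, stored as a factor.
f <- factor(c("1/5/2002", "12/10/2002"))

# as.Date() on a factor goes via as.character() and honours 'format'.
d_factor <- as.Date(f, format = "%d/%m/%Y")
d_char   <- as.Date(as.character(f), format = "%d/%m/%Y")
print(identical(d_factor, d_char))

# format() pads to a common width; as.character() leaves the strings alone.
padded <- format(f)
plain  <- as.character(f)
print(nchar(padded))
print(nchar(plain))
```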

However, as.is can also be applied selectively to columns: it can be a logical vector or a vector of column indices (numeric or character).
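
For instance, a sketch with a made-up three-column file, where only the two date columns are kept as character and the remaining character column is converted to a factor. The data and the group column are hypothetical.

```r
# Hypothetical CSV content; only the date column names come from the question.
csv_text <- "group,preDate,postDate
A,7/27/2005,8/15/2005
B,1/3/2004,3/1/2004"

# as.is given as column names: these columns stay character,
# other character columns are turned into factors.
by_name <- read.csv(text = csv_text, as.is = c("preDate", "postDate"))

# The same selection written as numeric indices and as a logical vector.
by_index   <- read.csv(text = csv_text, as.is = 2:3)
by_logical <- read.csv(text = csv_text, as.is = c(FALSE, TRUE, TRUE))

print(sapply(by_name, class))
```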

> As an aside, you might save yourself some time by using read.xls()
> from the gdata package.
>
> And of course, there's always the ugly work-around. In your Excel,
> create new columns in which the dates are formatted as numbers,
> presumably as the number of days since whatever Excel uses for its
> origin. Then, in R, you can simply subtract the numbers. If you have
> date-time values in Excel, this might be a little trickier.
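
A sketch of that work-around, assuming the workbook uses the Windows (1900) date system. The serial numbers are made up, and the origin is the one the as.Date() help page gives for post-1901 Windows Excel dates; a workbook using the Mac 1904 date system would need a different origin.

```r
# Hypothetical serial day numbers exported from Excel as plain numbers.
pre_serial  <- c(38560, 35981)
post_serial <- c(38579, 36000)

# Follow-up in days needs no date conversion at all.
followup_days <- post_serial - pre_serial
print(followup_days)

# Converting to Date objects, assuming the Windows Excel origin.
pre_date  <- as.Date(pre_serial,  origin = "1899-12-30")
post_date <- as.Date(post_serial, origin = "1899-12-30")
print(pre_date)
```
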
>
> -Don
>
> At 9:28 PM -0400 7/27/05, John Sorkin wrote:
> >I am using read.csv to read a CSV file (produced by saving an Excel file
> >as a CSV file). The columns containing dates are being read as factors.
> >Because of this, I can not compute follow-up time, i.e.
> >Followup<-postDate-preDate. I would appreciate any suggestion that would
> >help me read the dates as dates and thus allow me to calculate follow-up
> >time.
> >Thanks
> >John
> >
> >John Sorkin M.D., Ph.D.
> >Chief, Biostatistics and Informatics
> >Baltimore VA Medical Center GRECC and
> >University of Maryland School of Medicine Claude Pepper OAIC
> >
> >University of Maryland School of Medicine
> >Division of Gerontology
> >Baltimore VA Medical Center
> >10 North Greene Street
> >GRECC (BT/18/GR)
> >Baltimore, MD 21201-1524
> >
> >410-605-7119
> >--- NOTE NEW EMAIL ADDRESS:
> >jsorkin@grecc.umaryland.edu
> >
> >______________________________________________
> >R-help@stat.math.ethz.ch mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>
> --
> --------------------------------------
> Don MacQueen
> Environmental Protection Department
> Lawrence Livermore National Laboratory
> Livermore, CA, USA
>

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)                  FAX: (+45) 35327907

Received on Fri Jul 29 01:19:40 2005
