Re: [R] dates in French format

From: Denis Chabot <chabotd_at_globetrotter.net>
Date: Thu, 31 Jan 2008 09:46:20 -0500

(I've put the R Mac list in cc because of the crashes I have experienced trying some of the suggestions below)

Hi Gabor and Prof Ripley,

On 31 Jan 08 at 02:11, Prof Brian Ripley wrote:

> The output from sessionInfo() the posting guide asked for would have
> been very helpful here.

You are right, sorry about that:

 > library(chron)
 > sessionInfo()
R version 2.6.1 (2007-11-26)
i386-apple-darwin8.10.1

locale:
fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] chron_2.3-16

>
>
> I think the problem is likely to be that these are not standard French
> abbreviations according to my systems.

I was ready to blame Excel for the non-standard abbreviations, but I would have been wrong: judging from my system settings, "janv" is a Mac OS X choice. I am not sure what would count as a bullet-proof authority on French abbreviations. My dictionary was no help, but Wikipedia seems to endorse the "janv" used by Mac OS X and Windows:

<http://fr.wikipedia.org/wiki/Mois#Abr.C3.A9viations>

> On Linux I get
>
>> format(Sys.Date(), "%d-%b-%y")
> [1] "31-jan-08"
>> format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc-07"
>
> and on Windows
>
>> format(Sys.Date(), "%d-%b-%y")
> [1] "31-janv.-08"
>
>> format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc.-07"

I tried this too:
 > format(Sys.Date(), "%d-%b-%y")
[1] "31-jan-08"
 > format(Sys.Date()-50, "%d-%b-%y")
[1] "12-déc-07"

I am lost here: since the OS uses "janv", why did the above give "jan"???
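One possible explanation, offered as an assumption rather than a diagnosis: R's "%b" takes its month abbreviations from the C library's LC_TIME data, which need not agree with what the OS X settings panel displays. A quick way to see what R itself produces (and therefore expects when parsing) in the current session:

```r
# Which LC_TIME locale is R actually using?
Sys.getlocale("LC_TIME")

# The twelve month abbreviations R produces for "%b" in that locale.
# ISOdate() builds the dates from numbers, so no month names are parsed here.
abbrevs <- format(ISOdate(2000, 1:12, 1), "%b")
abbrevs
```

The output is deliberately not shown, since it depends on the platform and locale.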

>
>
> And yes, chron is US-centric and so only allows English names.
>
> Assuming you know exactly what is meant by 'French short format', I
> think the simplest thing to do is to set up a table by
>
> tr <- month.abb
> names(tr)[1] <- c("janv") # complete it
>
> x <- "9-janv-08"
> x2 <- strsplit(x, "-")
> x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
> as.Date(x3, format = "%d-%b-%y")

Thank you Prof Ripley, although I'll have to do my homework to fully understand what is happening with the function you wrote.
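For anyone else unpacking that function, here is a minimal sketch of the same lookup-table idea with the table completed and each step commented. The French spellings are an assumption (the suggestion above only fills in "janv"), and the variable name p is used only for illustration:

```r
# English abbreviations as values, French abbreviations as names.
# The French spellings are an assumption; adjust them to match the data.
tr <- month.abb
names(tr) <- c("janv", "fév", "mars", "avr", "mai", "juin",
               "juil", "août", "sept", "oct", "nov", "déc")

x  <- c("9-janv-08", "21-mai-07")
x2 <- strsplit(x, "-")             # list of c(day, month, year) pieces
x3 <- sapply(x2, function(p) {
  p[2] <- tr[p[2]]                 # look up the English abbreviation by name
  paste(p, collapse = "-")         # reassemble, e.g. "9-Jan-08"
})
x3
```

The final as.Date(x3, format = "%d-%b-%y") step still relies on "%b", so it presumably needs an LC_TIME locale whose abbreviations match the English ones.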

But I wonder why I cannot make this a Date object:

 > x <- "9-janv-08"
 > x2 <- strsplit(x, "-")
 > x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
 > as.Date(x3, format = "%d-%b-%y")
[1] "2008-01-09"
 > class(x3)
[1] "character"
 > x4 <- as.Date(x3, format = "%d-%b-%y")

Traceback:

  1. strptime(x, format)
  2. as.Date.character(x3, format = "%d-%b-%y")
  3. as.Date(x3, format = "%d-%b-%y")

Possible actions:

1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

The problem may be my system as I get this error when trying Gabor's suggestions (below).

On 31 Jan 08 at 00:21, Gabor Grothendieck wrote:
> Suppose we have:
>
> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
> "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> "16-janv-08", "18-janv-08")
>
> Try this (where we are assuming the just released chron 2.3-17):
>
> library(chron)
> Sys.setlocale("LC_ALL", "French")
> as.chron(as.Date(dd, "%d-%b-%y"))
>
> # or with chron 2.3-16 last line is replaced with:
> chron(unclass(as.Date(dd, "%d-%b-%y")))
>

 > library(chron)
 > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
+ "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
+ "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
+ "16-janv-08", "18-janv-08")

 > Sys.setlocale("LC_ALL", "French")
[1] ""
Warning message:
In Sys.setlocale("LC_ALL", "French") :
  OS reports request to set locale to "French" cannot be honored
 > chron(unclass(as.Date(dd, "%d-%b-%y")))

Traceback:

  1. strptime(x, format)
  2. as.Date.character(dd, "%d-%b-%y")
  3. as.Date(dd, "%d-%b-%y")
  4. inherits(dates., "dates")
  5. chron(unclass(as.Date(dd, "%d-%b-%y")))

Possible actions:

1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
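On the Sys.setlocale warning above: locale names are platform specific, so "French" is understood on Windows but generally not on Mac OS X or Linux, where names such as "fr_FR.UTF-8" are used. A minimal sketch that tries several candidates in turn follows; set_french_time() is a hypothetical helper, and the candidate list is an assumption about which names a given system might accept:

```r
# Try several spellings of a French locale for LC_TIME only.
# Sys.setlocale() returns "" (with a warning) when the OS rejects a name.
set_french_time <- function(candidates = c("fr_FR.UTF-8", "fr_FR",
                                           "French_France.1252", "French")) {
  for (loc in candidates) {
    res <- suppressWarnings(Sys.setlocale("LC_TIME", loc))
    if (nzchar(res)) return(res)   # first name the OS accepted
  }
  NA_character_                    # nothing worked on this system
}

old <- Sys.getlocale("LC_TIME")    # remember the current setting
got <- set_french_time()
got
Sys.setlocale("LC_TIME", old)      # restore it afterwards
```

Setting only LC_TIME rather than LC_ALL leaves collation and number formatting untouched. This does not address the crash itself, which occurs inside strptime.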

> If those don't work (the above didn't work on my Vista system but this
> is system dependent and
> might work on yours) then try this alternative
>
>> library(chron)
>> library(gsubfn)
>> Sys.setlocale('LC_ALL','French')
> [1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
>> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
>> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
> [17] 01/18/08

Again, this Sys.setlocale call does not work for me and the use of as.Date crashes my copy of R:

 > library(chron)
 > library(gsubfn)
Loading required package: proto
 > french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")

Traceback:

  1. strptime(x, f)
  2. fromchar(x)
  3. as.Date.character("2000-01-01")
  4. as.Date("2000-01-01")
  5. seq(as.Date("2000-01-01"), length = 12, by = "month")
  6. format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")

Possible actions:

1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

However, if I replace that call with a hard-coded vector of abbreviations, the rest of Gabor's solution works.

 > library(chron)
 > library(gsubfn)
Loading required package: proto
 > french.months <- c("janv", "fév", "mars", "avr", "mai", "juin", "juil", "août", "sept", "oct", "nov", "déc")
 > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
+ "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
+ "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
+ "16-janv-08", "18-janv-08")

 > f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
 > strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
 [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
 [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
[17] 01/18/08
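Since the hard-coded vector is what made this work, the same idea can be sketched in base R alone, without chron or gsubfn and without ever asking R to parse a month name. This is only an illustration: parse_french_date() is a hypothetical helper, and the spellings in french_months are an assumption about the data:

```r
# French abbreviations as they appear in the data (an assumption; edit to suit).
french_months <- c("janv", "fév", "mars", "avr", "mai", "juin",
                   "juil", "août", "sept", "oct", "nov", "déc")

# Hypothetical helper: "9-janv-08" -> Date, via a numeric month.
parse_french_date <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  day   <- vapply(parts, `[`, character(1), 1)
  mon   <- vapply(parts, `[`, character(1), 2)
  year  <- vapply(parts, `[`, character(1), 3)
  mon   <- sub(".", "", mon, fixed = TRUE)   # tolerate "janv." as well as "janv"
  month_num <- match(mon, french_months)     # NA when the abbreviation is unknown
  # Only numeric fields reach as.Date(), so the LC_TIME month names are not used.
  as.Date(paste(year, month_num, day, sep = "-"), format = "%y-%m-%d")
}

parse_french_date(c("9-janv-08", "21-mai-07"))
```

Unlike pmatch, match requires the abbreviation to agree exactly with an entry in the table, so an unexpected spelling yields NA rather than a partial match.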

So thanks again. I will try to reinstall R on my computer and see if I still get these errors.

Denis

>
>
>
> On Jan 30, 2008 11:29 PM, Denis Chabot <chabotd@globetrotter.net>
> wrote:
>> Hello R users,
>>
>> I have to import a file with one column containing dates written in
>> French short format, such as:
>>
>> 7-déc-07
>> 11-déc-07
>> 14-déc-07
>> 18-déc-07
>> 21-déc-07
>> 24-déc-07
>> 26-déc-07
>> 28-déc-07
>> 31-déc-07
>> 2-janv-08
>> 4-janv-08
>> 7-janv-08
>> 9-janv-08
>> 11-janv-08
>> 14-janv-08
>> 16-janv-08
>> 18-janv-08
>>
>> There are other columns for other (numeric) variables in the data
>> file. In my read.csv2 statement, I indicate that the date column must
>> be imported "as.is" to keep it as character.
>>
>> I would like to transform this into a date object in R. So far I've
>> used chron for my dates and times needs, but I am willing to change if
>> another object/package will ease the task of importing these dates.
>>
>> My reading of the chron help led me to believe that the formats it
>> understands are only month names in English.
>>
>> Are there other "formats" I can use with chron, or must I somehow edit
>> this character variable to replace French month names with English ones
>> (or numbers from 1 to 12)?
>>
>> Thanks in advance,
>>
>> Denis
>> p.s. I read this in digest mode, so I'll get your replies faster if
>> you cc to my email
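For reference, the import step described in the original question can be sketched as follows. The column names and values are made up for illustration, and the file is replaced by an inline string so the example is self-contained:

```r
# A tiny stand-in for the data file: semicolon separated, comma as decimal mark.
csv_text <- "date;temp\n2-janv-08;3,5\n4-janv-08;4,25\n"

# as.is = "date" keeps that column as character instead of converting it.
dat <- read.csv2(text = csv_text, as.is = "date")
str(dat)
```

With a real file, the text argument would be replaced by the file name.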



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 31 Jan 2008 - 14:50:34 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 31 Jan 2008 - 22:30:09 GMT.
