Re: [R] dates in French format

From: Denis Chabot <chabotd_at_globetrotter.net>
Date: Thu, 31 Jan 2008 16:25:57 -0500

Hi all,

The crashes I reported earlier were caused by R 2.6.1 for Mac not liking the OS date setting "french canada", an issue that has since been solved (by Simon Urbanek). The crashes did not occur when the OS was set to use the normal French formats for dates. With that setting, the suggestions by Prof Ripley and Gabor all worked nicely.
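
For anyone reading this message first, here is a minimal sketch of the lookup-table idea from the quoted thread below, written so that it does not depend on the locale at all. The function name parse_french_dates is made up for illustration, the month abbreviations are the ones I typed in by hand further down, and I am assuming that two-digit years follow the usual %y convention.

```r
# French month abbreviations as they appear in the data file (no trailing dot)
french_months <- c("janv", "fév", "mars", "avr", "mai", "juin",
                   "juil", "août", "sept", "oct", "nov", "déc")

# Hypothetical helper: turn "7-déc-07" into a Date without relying on LC_TIME
parse_french_dates <- function(x, months = french_months) {
  parts <- strsplit(x, "-", fixed = TRUE)
  day   <- vapply(parts, `[`, character(1), 1)
  mon   <- vapply(parts, `[`, character(1), 2)
  year  <- vapply(parts, `[`, character(1), 3)
  month_number <- match(mon, months)   # NA if the abbreviation is unknown
  # %m is numeric, so the parsing step no longer depends on month names
  as.Date(paste(day, month_number, year, sep = "-"), format = "%d-%m-%y")
}

dd <- c("7-déc-07", "31-déc-07", "2-janv-08", "18-janv-08")
parsed <- parse_french_dates(dd)
```

An unknown abbreviation simply yields NA for that element rather than an error, which makes bad rows easy to spot after import.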

Now that my dates are a chron object, I have a new problem. The formatting of the dates on the x axis leaves something to be desired. Instead of day, month and year, or at the very least day and month, I only get month and year, so many tick labels are identical. I also get a warning which puzzles me.

For instance:

 > start <- chron("12/01/2007")
 > other.dates <- seq(1,60,2)
 > Date <- start + other.dates
 > plot(1:length(Date)~Date)

6 ticks appear on the x axis. The first three are labeled "12/07" and the other three are labeled "01/08". I also get this:

Warning messages:
1: In v[[perm[1]]] : correspondance partielle de 'm' en 'month'
2: In v[[perm[2]]] : correspondance partielle de 'y' en 'year'

So there is only a partial match ("correspondance partielle") between "m" and "month" and between "y" and "year". Yet "Date" here is a proper chron object, so I fail to see why the match is only partial.
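
The wording of the warning looks like what `[[` emits when a list is indexed by an abbreviated element name, so it may concern names inside a list (apparently "month" and "year") rather than the validity of the chron object. That is an assumption about the cause, not something I have traced in the chron code. The list `v` below is invented for illustration; only the behaviour of the `exact` argument of `[[` is being shown.

```r
# A made-up list whose element names resemble those in the warning
v <- list(month = 12, day = 1, year = 2007)

# Default in current R (exact = TRUE): "m" does not match "month"
exact_lookup <- v[["m"]]

# exact = FALSE: partial matching is allowed and silent
quiet_lookup <- v[["m", exact = FALSE]]

# exact = NA: partial matching is allowed but a warning is signalled
noisy_lookup <- tryCatch(v[["m", exact = NA]],
                         warning = function(w) "warning raised")
```

If this is indeed the mechanism, the warning is harmless as far as the plotted values are concerned: the partially matched element is still the intended one.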

If I do Date2 <- as.Date(Date) and use this as my x axis, the six labels are more usable (déc 03, déc 13, déc 23, jan 02, jan 12, jan 22).

I suppose I can plot without x labels and draw my own, but I had not expected it would be necessary.
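
A rough sketch of what drawing my own labels could look like, using base R Date objects only (a chron object would first be converted with as.Date(), as above). The 10-day tick spacing and the numeric label format are arbitrary choices for illustration.

```r
# Same dates as in the example above, built with base R only
start <- as.Date("2007-12-01")
Date  <- start + seq(1, 60, 2)
y     <- seq_along(Date)

# Tick positions roughly every 10 days across the data range
ticks  <- seq(min(Date), max(Date), by = "10 days")
labels <- format(ticks, "%d/%m/%y")   # numeric format, so locale-independent

pdf(NULL)                              # null device: nothing is written to disk
plot(Date, y, xaxt = "n", xlab = "Date")          # suppress the default x axis
axis(1, at = as.numeric(ticks), labels = labels)  # draw custom tick labels
invisible(dev.off())
```

Replacing "%d/%m/%y" with "%d %b" would give labels such as "déc 03" style day and month names, but those then depend on the LC_TIME setting again.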

 > sessionInfo()
R version 2.6.1 (2007-11-26)
i386-apple-darwin8.10.1

locale:
fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] zoo_1.4-1 chron_2.3-16

loaded via a namespace (and not attached):
[1] grid_2.6.1 lattice_0.17-2 tools_2.6.1

Denis

On 31 Jan 2008, at 09:46, Denis Chabot wrote:

> (I've put the R Mac list in cc because of the crashes I have
> experienced trying some of the suggestions below)
>
> Hi Gabor and Prof Ripley,
>
> On 31 Jan 2008, at 02:11, Prof Brian Ripley wrote:
>
>> The output from sessionInfo() the posting guide asked for would
>> have been very helpful here.
>
> You are right, sorry about that:
>
>
> > library(chron)
> > sessionInfo()
> R version 2.6.1 (2007-11-26)
> i386-apple-darwin8.10.1
>
> locale:
> fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] chron_2.3-16
>
>
>>
>>
>> I think the problem is likely to be that these are not standard
>> French abbreviations according to my systems.
>
> I was ready to blame Excel for the use of non-standard
> abbreviations, but I would have been wrong: it seems that "janv" is
> a Mac OS X decision, from what I can see in my system settings. I am
> not sure what would be a bullet-proof authority on French
> abbreviations. My dictionary was of no help, but Wikipedia seems to
> endorse the Mac OS X and Windows use of "janv":
>
> <http://fr.wikipedia.org/wiki/Mois#Abr.C3.A9viations>
>
>> On Linux I get
>>
>>> format(Sys.Date(), "%d-%b-%y")
>> [1] "31-jan-08"
>>> format(Sys.Date()-50, "%d-%b-%y")
>> [1] "12-déc-07"
>>
>> and on Windows
>>
>>> format(Sys.Date(), "%d-%b-%y")
>> [1] "31-janv.-08"
>>
>>> format(Sys.Date()-50, "%d-%b-%y")
>> [1] "12-déc.-07"
>
> I tried this too:
> > format(Sys.Date(), "%d-%b-%y")
> [1] "31-jan-08"
> > format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc-07"
>
> I am lost here: since the OS uses "janv", why did the above give
> "jan"???
>
>>
>>
>> And yes, chron is US-centric and so only allows English names.
>>
>> Assuming you know exactly what is meant by 'French short format', I
>> think the simplest thing to do is to set up a table by
>>
>> tr <- month.abb
>> names(tr)[1] <- c("janv") # complete it
>>
>> x <- "9-janv-08"
>> x2 <- strsplit(x, "-")
>> x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
>> as.Date(x3, format = "%d-%b-%y")
>
> Thank you Prof Ripley, although I'll have to do my homework to fully
> understand what is happening with the function you wrote.
>
> But I wonder why I cannot make this a Date object:
>
> > x <- "9-janv-08"
> > x2 <- strsplit(x, "-")
> > x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
> > as.Date(x3, format = "%d-%b-%y")
> [1] "2008-01-09"
> > class(x3)
> [1] "character"
> > x4 <- as.Date(x3, format = "%d-%b-%y")
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, format)
> 2: as.Date.character(x3, format = "%d-%b-%y")
> 3: as.Date(x3, format = "%d-%b-%y")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
> The problem may be my system as I get this error when trying Gabor's
> suggestions (below).
>
> On 31 Jan 2008, at 00:21, Gabor Grothendieck wrote:
>> Suppose we have:
>>
>> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
>> "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
>> "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
>> "16-janv-08", "18-janv-08")
>>
>> Try this (where we are assuming the just released chron 2.3-17):
>>
>> library(chron)
>> Sys.setlocale("LC_ALL", "French")
>> as.chron(as.Date(dd, "%d-%b-%y"))
>>
>> # or with chron 2.3-16 last line is replaced with:
>> chron(unclass(as.Date(dd, "%d-%b-%y")))
>>
>
> > library(chron)
> > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
> + "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> + "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> + "16-janv-08", "18-janv-08")
> > Sys.setlocale("LC_ALL", "French")
> [1] ""
> Warning message:
> In Sys.setlocale("LC_ALL", "French") :
> la requête OS pour spécifier la localisation à "French" n'a pas pu
> être honorée
> > chron(unclass(as.Date(dd, "%d-%b-%y")))
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, format)
> 2: as.Date.character(dd, "%d-%b-%y")
> 3: as.Date(dd, "%d-%b-%y")
> 4: inherits(dates., "dates")
> 5: chron(unclass(as.Date(dd, "%d-%b-%y")))
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
>> If those don't work (the above didn't work on my Vista system, but
>> this is system dependent and might work on yours), then try this
>> alternative:
>>
>>> library(chron)
>>> library(gsubfn)
>>> Sys.setlocale('LC_ALL','French')
>> [1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
>>> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>>> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
>>> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
>> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
>> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
>> [17] 01/18/08
>
> Again, this Sys.setlocale call does not work for me and the use of
> as.Date crashes my copy of R:
>
> > library(chron)
> > library(gsubfn)
> Le chargement a nécessité le package : proto
> > french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, f)
> 2: fromchar(x)
> 3: as.Date.character("2000-01-01")
> 4: as.Date("2000-01-01")
> 5: seq(as.Date("2000-01-01"), length = 12, by = "month")
> 6: format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
> However, if I replace that call by this, the rest of Gabor's
> solution works.
>
> > library(chron)
> > library(gsubfn)
> Le chargement a nécessité le package : proto
> > french.months <- c("janv", "fév", "mars", "avr", "mai", "juin",
> "juil", "août", "sept", "oct", "nov", "déc")
> > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
> + "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> + "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> + "16-janv-08", "18-janv-08")
> > f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
> > strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
> [17] 01/18/08
>
> So thanks again. I will try to reinstall R on my computer and see if
> I still get these errors.
>
>
> Denis
>
>>
>>
>>
>> On Jan 30, 2008 11:29 PM, Denis Chabot <chabotd_at_globetrotter.net> wrote:
>>> Hello R users,
>>>
>>> I have to import a file with one column containing dates written in
>>> French short format, such as:
>>>
>>> 7-déc-07
>>> 11-déc-07
>>> 14-déc-07
>>> 18-déc-07
>>> 21-déc-07
>>> 24-déc-07
>>> 26-déc-07
>>> 28-déc-07
>>> 31-déc-07
>>> 2-janv-08
>>> 4-janv-08
>>> 7-janv-08
>>> 9-janv-08
>>> 11-janv-08
>>> 14-janv-08
>>> 16-janv-08
>>> 18-janv-08
>>>
>>> There are other columns for other (numeric) variables in the data
>>> file. In my read.csv2 statement, I indicate that the date column
>>> must be imported "as.is" to keep it as character.
>>>
>>> I would like to transform this into a date object in R. So far I've
>>> used chron for my date and time needs, but I am willing to change
>>> if another object/package will ease the task of importing these
>>> dates.
>>>
>>> My reading of the chron help led me to believe that the formats it
>>> understands are only month names in English.
>>>
>>> Are there other "formats" I can use with chron, or must I somehow
>>> edit this character variable to replace French month names by
>>> English ones (or numbers from 1 to 12)?
>>>
>>> Thanks in advance,
>>>
>>> Denis
>>> p.s. I read this in digest mode, so I'll get your replies faster if
>>> you cc to my email

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu 31 Jan 2008 - 21:30:38 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 31 Jan 2008 - 22:30:09 GMT.