Re: [R] regexpr mystery can not remove trailing spaces

From: Petr PIKAL <petr.pikal_at_precheza.cz>
Date: Wed, 02 Jun 2010 16:55:35 +0200

Hi

I still have the original data, for which sub(' +$', '', ...) did not work, saved in Excel, so I could try them again.

> grep("\t", as.character(becva$V1[1]))
integer(0)
> grep("\n", as.character(becva$V1[1]))
integer(0)

and Jim's solutions work as expected

> sub('[[:space:]]+$', '', becva$V1[1])
[1] "02.06.10 12:40"
> sub('\\W+$', '', becva$V1[1])
[1] "02.06.10 12:40"
> sub(' +$', '', becva$V1[1])
[1] "02.06.10 12:40 "

However, with data read directly from the internet there is no problem and all of the above commands work. There may be some Excel data issue, which is not worth solving.
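A quick way to see what the trailing character really is, rather than testing for tab and newline one at a time, is to look at its code point. The sketch below is only an illustration: the string `x` is made up, and it assumes the offending character is a non-breaking space (U+00A0), which often appears in text copied from web pages or spreadsheets but is not confirmed for the data in this thread.

```r
# Hypothetical value; the trailing character is ASSUMED to be a
# non-breaking space (U+00A0), not a confirmed copy of the original data.
x <- "02.06.10 12:40\u00a0"

# Code points of the string; an ordinary space is 32, U+00A0 is 160.
cp <- utf8ToInt(x)
last_cp <- cp[length(cp)]
last_cp

# A pattern for ordinary spaces only leaves the string unchanged ...
unchanged <- sub(" +$", "", x)

# ... while naming the character explicitly removes it.
cleaned <- sub("[ \u00a0]+$", "", x)
cleaned
```

If `last_cp` turns out to be something other than 32, that would explain why `' +$'` fails while a broader class such as `\\W` still matches.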

Thanks to you all.

Regards
Petr

Joris Meys <jorismeys_at_gmail.com> wrote on 02.06.2010 16:11:05:

> Hi Petr,
>
> Matt may very well have been right. As I copied the dput from the mail, any
> white space was apparently converted to spaces. Still, it is possible that the
> white space in your original data consists of tabs or even newline characters.
> You can check that easily with
>
> grep("\t", as.character(becva$V1[1]))
> grep("\n", as.character(becva$V1[1]))
>
> Cheers
> Joris
>
>

> On Wed, Jun 2, 2010 at 3:54 PM, Petr PIKAL <petr.pikal@precheza.cz> wrote:
> Hi
>
> thanks. I am puzzled about what was wrong. Now even
>
> sub(' +$', '', bbb[1])
>
> works. I am checking the water flow in a nearby river and copying the data
> from the internet, so I wonder whether something changed recently, since
> during floods they update it at roughly 10-minute intervals.
>
> Regards
> Petr
>
>
> jim holtman <jholtman_at_gmail.com> wrote on 02.06.2010 15:44:42:
>
> > You had the wrong case on 'w' and the wrong expression with
> > '[:space:]'; see below
> >
> > > bbb <- c("02.06.10 12:40 ", "02.06.10 12:00 ", "02.06.10 11:00 ",
> > + "02.06.10 10:00 ", "02.06.10 09:00 ", "02.06.10 08:00 ",
> > + "02.06.10 07:00 ", "02.06.10 06:00 ", "02.06.10 05:00 ",
> > + "02.06.10 04:00 ", "02.06.10 03:00 ", "02.06.10 02:00 ",
> > + "02.06.10 01:00 ", "02.06.10 00:00 ", "01.06.10 23:00 ",
> > + "01.06.10 22:00 ", "01.06.10 21:00 ", "01.06.10 20:00 ",
> > + "01.06.10 19:00 ", "01.06.10 18:00 ", "01.06.10 17:00 ",
> > + "01.06.10 16:00 ", "01.06.10 15:00 ", "01.06.10 14:00 ",
> > + "01.06.10 13:00 ", "01.06.10 05:00 ", "31.05.10 05:00 ",
> > + "30.05.10 05:00 ", "29.05.10 05:00 ", "28.05.10 05:00 ",
> > + "27.05.10 05:00 ")
> > > sub('\\W+$', '', bbb[1])
> > [1] "02.06.10 12:40"
> > > sub('[[:space:]]+$', '', bbb[1])
> > [1] "02.06.10 12:40"
> > >
> >
> >
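Jim's two corrections can be checked on a made-up string. `\\w` matches word characters (letters, digits, underscore), so `\\w+$` cannot match a string that ends in a space, while `\\W` is its complement. `[:space:]` is a character class only inside a bracket expression; written bare, it is read as the set of characters `:`, `s`, `p`, `a`, `c`, `e`, which is why the colon in the time disappeared in the earlier attempt. The variable `s` below is a hypothetical example, not the original data, and the calls use R's default regex engine.

```r
# Hypothetical example string ending in one ordinary space.
s <- "02.06.10 12:40 "

# \\w matches word characters, so \\w+$ finds nothing to replace here.
lower_w <- sub("\\w+$", "", s)

# \\W matches non-word characters, so the trailing space is removed.
upper_w <- sub("\\W+$", "", s)

# Bare [:space:] is a bracket expression over the characters ":space",
# so the first match is the colon in the time, not the trailing space.
bare_class <- sub("[:space:]", "", s)

# Inside an outer bracket pair it is the POSIX whitespace class.
posix_class <- sub("[[:space:]]+$", "", s)

print(c(lower_w, upper_w, bare_class, posix_class))
```

The doubled brackets are the part that is easy to miss: the outer pair opens the bracket expression and the inner `[:space:]` names the class.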
> > On Wed, Jun 2, 2010 at 9:22 AM, Petr PIKAL <petr.pikal_at_precheza.cz> wrote:
> > > Hi
> > >
> > >> dput(bbb)
> > > c("02.06.10 12:40 ", "02.06.10 12:00 ", "02.06.10 11:00 ",
> > > "02.06.10 10:00 ", "02.06.10 09:00 ", "02.06.10 08:00 ",
> > > "02.06.10 07:00 ", "02.06.10 06:00 ", "02.06.10 05:00 ",
> > > "02.06.10 04:00 ", "02.06.10 03:00 ", "02.06.10 02:00 ",
> > > "02.06.10 01:00 ", "02.06.10 00:00 ", "01.06.10 23:00 ",
> > > "01.06.10 22:00 ", "01.06.10 21:00 ", "01.06.10 20:00 ",
> > > "01.06.10 19:00 ", "01.06.10 18:00 ", "01.06.10 17:00 ",
> > > "01.06.10 16:00 ", "01.06.10 15:00 ", "01.06.10 14:00 ",
> > > "01.06.10 13:00 ", "01.06.10 05:00 ", "31.05.10 05:00 ",
> > > "30.05.10 05:00 ", "29.05.10 05:00 ", "28.05.10 05:00 ",
> > > "27.05.10 05:00 ")
> > >>
> > >
> > > For simplicity I changed the name and put it into a single variable.
> > > I also updated R to a recent R-devel.
> > >
> > >> sub('\\w+$', '', bbb[1])
> > > [1] "02.06.10 12:40 "
> > >> sub('[:space:]', '', bbb[1])
> > > [1] "02.06.10 1240 "
> > >>
> > >
> > > I also tried Matt's suggestion but it did not help.
> > >
> > > Regards
> > > Petr
> > >
> > > Joris Meys <jorismeys_at_gmail.com> wrote on 02.06.2010 14:35:19:
> > >
> > >> Could you provide us with dput(becva$V1[1])?
> > >> Cheers
> > >> Joris
> > >
> > >> On Wed, Jun 2, 2010 at 2:07 PM, Petr PIKAL <petr.pikal_at_precheza.cz> wrote:
> > >> Dear all
> > >>
> > >> I encountered a strange problem with regular expression replacement
> > >>
> > >> I made this character object
> > >>
> > >> str <- "02.06.10 12:40 "
> > >>
> > >> > str(str)
> > >> chr "02.06.10 12:40 "
> > >>
> > >> I read in an object which seems to be quite similar
> > >>
> > >> > str(as.character(becva$V1)[1])
> > >> chr "02.06.10 12:40 "
> > >>
> > >> However I can not remove trailing spaces from it
> > >>
> > >> > sub(' +$', '', as.character(becva$V1[1]))
> > >>
> > >> [1] "02.06.10 12:40 "
> > >> > sub(' +$', '', str)
> > >> [1] "02.06.10 12:40"
> > >> >
> > >>
> > >> Does anybody have an idea what to do?
> > >>
> > >> $version.string
> > >> [1] "R version 2.12.0 Under development (unstable) (2010-04-25 r51820)"
> > >>
> > >> on Windows
> > >>
> > >> Regards
> > >> Petr
> > >>
> > >> ______________________________________________
> > >> R-help_at_r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > >>
> > >>
> > >> --
> > >> Joris Meys
> > >> Statistical Consultant
> > >>
> > >> Ghent University
> > >> Faculty of Bioscience Engineering
> > >> Department of Applied mathematics, biometrics and process control
> > >>
> > >> Coupure Links 653
> > >> B-9000 Gent
> > >>
> > >> tel : +32 9 264 59 87
> > >> Joris.Meys_at_Ugent.be
> > >> -------------------------------
> > >> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
> > >
> > >
> >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem that you are trying to solve?




Received on Wed 02 Jun 2010 - 14:58:25 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 02 Jun 2010 - 15:00:27 GMT.
