Re: [R] dir() and RegEx and gsub()

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Fri 10 Jun 2005 - 03:10:09 EST

On 6/9/05, Hans-Peter <gchappi@gmail.com> wrote:
> Dear R-Users,
>
> I have two questions:
>
> a)
> in a directory there are 3 files:
> [1] "Data.~csv" "Kopie von Data.~csv" "VorlageTradefile.csv"
>
> The command "dir( fold, pattern = "\.csv" )" gives back *all* the 3 files
> With dir( fold, pattern = "\\.csv" ) I get back only VorlageTradefile.csv.
> I don't understand this behaviour, IMHO the regex expression "\.csv"
> becomes the string ".csv" and "\\.csv" becomes "\.csv". So the first
> string should catch it. This is also consistent with the result when I
> tried with the TRegExpr Tool. Could somebody explain what's going on
> here?

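The dot (.) is a regular-expression wildcard that matches any single character. As you say, the string "\.csv" reaches the regular-expression engine as .csv, and that pattern matches ~csv because the . matches the ~. With "\\.csv" the engine sees \.csv, in which \. stands for a literal dot, so only VorlageTradefile.csv matches.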
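The two patterns can be compared without touching the file system by running them through grepl on the three file names from the question. This is only a small sketch; the vector name `files` is made up for illustration.

```r
# File names from the question (the vector name `files` is illustrative)
files <- c("Data.~csv", "Kopie von Data.~csv", "VorlageTradefile.csv")

# Regex .csv : the dot matches any single character, including "~"
any_char <- grepl(".csv", files)

# Regex \.csv (written "\\.csv" in R source): the dot must be a literal "."
literal_dot <- grepl("\\.csv", files)

print(any_char)     # [1] TRUE TRUE TRUE
print(literal_dot)  # [1] FALSE FALSE  TRUE
```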

By the way, note that

  1. "[.]csv" is one way to specify a literal dot without using backslashes
  2. you probably want "[.]csv$" so that a.csv.txt is not matched.
  3. Some regular expression functions have a fixed= argument that makes them treat all special characters, such as . and *, as ordinary characters; unfortunately, dir lacks that argument.
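The three points above can be tried on a plain character vector with grep. This is a sketch under assumptions: the file name "a.csv.txt" is taken from point 2, and the variable names are invented for the example.

```r
# Illustrative file names; "a.csv.txt" is the case mentioned in point 2
files <- c("Data.~csv", "VorlageTradefile.csv", "a.csv.txt")

# Character class: a literal dot without backslashes, but not anchored,
# so "a.csv.txt" still matches
unanchored <- grep("[.]csv", files, value = TRUE)

# Anchored with $: ".csv" must be the end of the name
anchored <- grep("[.]csv$", files, value = TRUE)

# fixed = TRUE: the pattern is treated as a plain substring, not a regex
fixed_match <- grep(".csv", files, value = TRUE, fixed = TRUE)

print(unanchored)   # [1] "VorlageTradefile.csv" "a.csv.txt"
print(anchored)     # [1] "VorlageTradefile.csv"
print(fixed_match)  # [1] "VorlageTradefile.csv" "a.csv.txt"
```

Since dir has no fixed= argument, one possible workaround is to call dir(fold) without a pattern and filter the returned names with grep(..., fixed = TRUE) in the same way.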

>
> b)
> I need to handle a copied windows file path. This is certainly often
> asked but I didn't find a solution.
> How can I convert, e.g.
>
> myfile <- "D:\UebungenNDK\DataMining\DataMiningSeries.r"

Variable myfile, as you have written it above, has no backslashes in it, so there is no way to know where they are supposed to be. Maybe what you mean is that you have a variable that is _stored_ as:

D:\UebungenNDK\...etc..

In that case it's already the same as myfile <- "D:\\UebungenNDK\\...etc..". Use nchar to check how many characters are stored.

e.g.

nchar("D:\\abc") # there are 6, not 7, characters in this string
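The difference between what is typed, what is stored and what is printed can be made visible as follows. This is just an illustration; the variable names `x`, `n` and `chars` are arbitrary.

```r
x <- "D:\\abc"   # two backslashes in the source, one in the stored string

print(x)         # print() escapes the backslash again: [1] "D:\\abc"
writeLines(x)    # writeLines() shows the stored characters: D:\abc

n <- nchar(x)                  # number of stored characters
chars <- strsplit(x, "")[[1]]  # the individual characters

print(n)         # [1] 6
```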

> in either:
>
> myfile
> [1] "D:\\UebungenNDK\\DataMining\\DataMiningSeries.r"
>
> or:
> myfile
> [1] "D:/UebungenNDK/DataMining/DataMiningSeries.r"
>
> Would be great to hear about a possibility!

You can convert backslashes to forward slashes using gsub:

gsub("\\", "/", "D:\\abc", fixed = TRUE)

Note that internally Windows understands forward slashes although many of the Windows commands do not.
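For a path that is pasted directly into a script, a sketch along the following lines may help. It assumes R 4.0.0 or later, where raw string literals of the form r"(...)" keep backslashes as typed; older versions of R do not have this syntax. chartr is shown as an alternative to gsub, and the variable names are illustrative.

```r
# Raw string literal (assumes R >= 4.0.0): backslashes are taken literally,
# so a path copied from Windows can be pasted unchanged
myfile <- r"(D:\UebungenNDK\DataMining\DataMiningSeries.r)"

# Convert backslashes to forward slashes in two equivalent ways
fwd_gsub   <- gsub("\\", "/", myfile, fixed = TRUE)
fwd_chartr <- chartr("\\", "/", myfile)

print(fwd_gsub)  # [1] "D:/UebungenNDK/DataMining/DataMiningSeries.r"
```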

In case I did not understand your question, have a look at ?file.path and also ?glob2rx in package sfsmisc. The first one constructs paths and the second one allows you to specify wildcards using globbing instead of regular expressions.
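A short sketch of both functions follows. Note the assumption: in current versions of R, glob2rx is available from the utils package, which is what the code below calls; with an older installation it would have to come from sfsmisc as mentioned above. The path components and file names are taken from the question.

```r
# file.path() joins components with "/" on every platform
p <- file.path("D:", "UebungenNDK", "DataMining", "DataMiningSeries.r")
print(p)   # [1] "D:/UebungenNDK/DataMining/DataMiningSeries.r"

# glob2rx() translates a shell-style wildcard into a regular expression
rx <- utils::glob2rx("*.csv")
print(rx)  # [1] "^.*\\.csv$"

# The resulting regex can be used wherever a pattern is expected
files <- c("Data.~csv", "Kopie von Data.~csv", "VorlageTradefile.csv")
hits <- grep(rx, files, value = TRUE)
print(hits)  # [1] "VorlageTradefile.csv"
```

The same `rx` could be passed as the pattern argument of dir.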



