Re: [Rd] (PR#8777) strsplit does [not] return correct value when spliting ""

From: Thomas Friedrichsmeier <thomas.friedrichsmeier_at_ruhr-uni-bochum.de>
Date: Mon 17 Apr 2006 - 21:50:38 GMT


Prof Brian Ripley wrote:
> On Mon, 17 Apr 2006, Charles Dupont wrote:
[...]
> > The man page states in the value section that strsplit returns:
> > A list of length 'length(x)' the 'i'-th element of which contains
> > the vector of splits of 'x[i]'.
> >
> > It mentions no change in behavior if the value of x[i] = "".
>
> There is none, for there are no splits in that case. I did ask you to
> point to the documentation of the rule you are assuming, and I can't find
> any.

No, the documentation does not explicitly mention this, but shouldn't "there are no splits" mean that the string is returned unchanged? Consider these examples; I don't think this is the behavior you'd expect unless told otherwise:

a <- "a"
b <- ""

a == strsplit (a, ",")	# TRUE
b == strsplit (b, ",")	# FALSE

So, maybe there is a general rule that empty elements get purged?

strsplit ("a,,b", ",")
[[1]]
[1] "a" "" "b"

strsplit ("a", "a")
[[1]]
[1] ""

Apparently not. Then why does an empty string get "split" into nothing at all, i.e. a zero-length result?
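The difference can be made explicit by inspecting the results with length() and identical() instead of ==, which coerces the list before comparing. This is only a minimal sketch in base R; the names res_a and res_b are made up for illustration:

```r
# Inspect what strsplit() returns for a non-empty and an empty input.
res_a <- strsplit("a", ",")   # non-empty string, separator not present
res_b <- strsplit("", ",")    # empty input string

length(res_a[[1]])                    # 1: the input comes back unchanged
length(res_b[[1]])                    # 0: no pieces at all
identical(res_a[[1]], "a")            # TRUE
identical(res_b[[1]], character(0))   # TRUE
identical(res_b[[1]], "")             # FALSE
```

So the list still has length(x) elements in both cases; only the element for the empty string has length zero.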

Note: I don't really care much about what the behavior is, but if the described behavior is indeed intended, I think it should be documented. IMO it's pretty counterintuitive.
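For code that relies on an empty string coming back unchanged, a small wrapper can restore that behavior. The function name strsplit_keep_empty is hypothetical and not part of R; this is a sketch of one possible workaround, not a proposed change to strsplit itself:

```r
# Hypothetical helper: like strsplit(), but an input that yields no pieces
# (such as "") is returned as a single empty string instead of character(0).
strsplit_keep_empty <- function(x, split, ...) {
  pieces <- strsplit(x, split, ...)
  # Find the elements for which strsplit() produced a zero-length vector.
  empty <- vapply(pieces, length, integer(1)) == 0L
  pieces[empty] <- list("")
  pieces
}

strsplit_keep_empty(c("a,b", "", "a"), ",")
```

With this wrapper the second element of the result is "" rather than character(0), while the other elements are the same as from strsplit.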

Regards
Thomas



R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Tue Apr 18 07:50:47 2006