Re: [Rd] (PR#8777) strsplit does [not] return correct value when splitting ""

From: Charles Dupont <charles.dupont_at_vanderbilt.edu>
Date: Mon 17 Apr 2006 - 17:27:19 GMT

Now using R 2.3.0.

I have a string that can be "". I want to find the maximum screen width of all the lines in the string, so I run the commands

  > x <- c("hello", "bob is\ngreat", "foo", "", "bar")
  > substrings <- strsplit(x, "\n")
  > sapply(substrings, FUN=function(x) max(nchar(x, type="width")))
which returns
[1] 5 6 3 -Inf 3

This happens because of the behavior of strsplit. For a string that is not ""
  > strsplit("Hello\nBob", "\n")

it returns
[[1]]
[1] "Hello" "Bob"

for a string that is ""
  > strsplit("", "\n")

it returns
[[1]]
character(0)

I would expect
[[1]]
[1] ""

because "" is a character vector of length 1 containing a string of length 0, not a character vector of length 0.

For any other string, if the split string is not matched in argument x, then strsplit returns the original string x.
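
To make the contrast concrete, here is a minimal sketch. The input "abc" and the variable names are arbitrary examples chosen for illustration:

```r
# A non-empty string with no match for the split pattern comes back unchanged
no_match <- strsplit("abc", "\n")[[1]]

# The empty string yields a zero-length character vector rather than ""
empty <- strsplit("", "\n")[[1]]

length(no_match)  # 1
length(empty)     # 0
```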

The man page states in the value section that strsplit returns:

      A list of length 'length(x)' the 'i'-th element of which contains
      the vector of splits of 'x[i]'.

It mentions no change in behavior if the value of x[i] = "".
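
Until this is resolved, one possible workaround for the width computation is to substitute "" for any zero-length result before calling max. This is only a sketch under that assumption; the names is_empty and widths are made up for the example:

```r
x <- c("hello", "bob is\ngreat", "foo", "", "bar")
substrings <- strsplit(x, "\n")

# strsplit gives character(0) for ""; substitute "" so max() sees one value
is_empty <- sapply(substrings, length) == 0
substrings[is_empty] <- list("")

# Widest line per element, measured in screen columns
widths <- sapply(substrings, function(s) max(nchar(s, type = "width")))
widths  # 5 6 3 0 3
```

With the substitution in place the empty string reports a width of 0 instead of -Inf.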

Prof Brian Ripley wrote:
> Please use a current version of R: we are at 2.3.0RC (and we do ask you
> not to report on obsolete versions).
>
> What rule are you using, and where did you find it in the R documentation?
>
> In fact
>
> > strsplit("", " ")
> [[1]]
> character(0)
>
> which is not as you stated. This is a feature, as it is distinct from
>
> > strsplit(" ", " ")
> [[1]]
> [1] ""
>
> Consider also
>
> > strsplit("", "")
> [[1]]
> character(0)
>
> > strsplit("a", "")
> [[1]]
> [1] "a"
>
> > strsplit("ab", "")
> [[1]]
> [1] "a" "b"
>
> On Mon, 17 Apr 2006, charles.dupont@vanderbilt.edu wrote:
>
>> Full_Name: Charles Dupont
>> Version: 2.2.0
>> OS: linux
>> Submission from: (NULL) (160.129.129.136)
>>
>>
>> when
>>
>> strsplit("", " ")
>>
>> returns character(0)
>>
>> whereas
>>
>> strsplit("a", " ")
>>
>> returns "a".
>>
>> these return values are not consistent with each other.
>>
>> Charles Dupont
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Charles Dupont	Computer System Analyst		School of Medicine
		Department of Biostatistics	Vanderbilt University

Received on Tue Apr 18 03:30:19 2006
