Re: [Rd] binary string conversion to a vector (PR#14120)

From: <tplate_at_acm.org>
Date: Sat, 12 Dec 2009 20:00:28 +0100 (CET)


Just responding to some of the issues in this long post:

(1) Don't rely on the printed form of two objects to decide whether or not they are identical. The function str() is very useful in this regard, and sometimes also unclass(). To see whether two objects are identical, use the function identical():

 > qvector <- c("0", "0", "0", "1", "1", "0", "1")
 > qvector[1]
[1] "0"
 > noquote(qvector[1])
[1] 0
 > str(noquote(qvector[1]))
Class 'noquote' chr "0"
 > as.integer(qvector[1])
[1] 0
 > str(as.integer(qvector[1]))
 int 0
 > identical(noquote(qvector[1]), as.integer(qvector[1]))
[1] FALSE
 >

Does this alleviate the concern as to the possibility of a bug in noquote/as.integer? Or were there deeper issues?
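
To make point (1) concrete outside the console, here is a minimal sketch. The names nq and int are mine, not from the original report. It checks class and storage type directly instead of relying on how the values print.

```r
# Hypothetical names (nq, int): illustration only, not code from the bug report.
qvector <- c("0", "0", "0", "1", "1", "0", "1")

nq  <- noquote(qvector[1])     # still a character string; only the print method differs
int <- as.integer(qvector[1])  # a genuine integer

class(nq)                           # shows the class attribute added by noquote()
is.character(nq)                    # the underlying data is still character
is.integer(int)                     # as.integer() really converts the type
identical(unclass(nq), qvector[1])  # stripping the class recovers the original string
identical(nq, int)                  # the two objects are not identical
```

So nq prints like a number but remains a character string, which means it cannot stand in for the result of as.integer().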

(2) To see how some other users of R have packaged up miscellaneous functions that might be of use to other people, look for packages on CRAN with "misc" in their names -- I see almost 10 of them. The problem with just posting snippets of code is that they get lost in all the other posts here, and many long-term R users have dozens if not hundreds of their own functions that are streamlined for their own frequent tasks and style of programming.

(3) Sounds like a great idea to use R to bring statistical rigor into the analysis of the performance of combinatorial optimization algorithms!

(4) install.packages("stringr") works fine for me. Maybe it was a temporary glitch? Have you checked whether you have a valid repository selected? E.g., I have in my .Rprofile: options(repos=c(CRAN="http://cran.cnr.Berkeley.edu" , CRANextra="http://www.stats.ox.ac.uk/pub/RWin"))
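
A quick way to check the repository setting from within R, as a sketch only: the helper name has_cran_repo is hypothetical, and the mirror URL is simply the one from my .Rprofile above (any working CRAN mirror will do). As far as I know, an unset CRAN entry shows up as the placeholder "@CRAN@".

```r
# Hypothetical helper: TRUE if a usable CRAN entry is set in options("repos").
has_cran_repo <- function(repos = getOption("repos")) {
  cran <- unname(repos["CRAN"])
  # "@CRAN@" is the placeholder R uses before a mirror has been chosen
  length(cran) == 1 && !is.na(cran) && nzchar(cran) && cran != "@CRAN@"
}

# Mirror URL taken from the .Rprofile line above; substitute your preferred mirror
options(repos = c(CRAN = "http://cran.cnr.Berkeley.edu"))
has_cran_repo()
# install.packages("stringr")   # would now query the mirror set above
```

If has_cran_repo() returns FALSE in a fresh session, setting the repos option as shown (or picking a mirror when prompted) is the first thing to try.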

Enjoy learning R!

Franc Brglez wrote:
> Hello!
>
> Please accept my sincere apologies for annoying the R development team with my post this week. If I were required to register as "a developer" before submission, this would not have happened. To rehabilitate myself, please find at the bottom of this mail two R-functions, 'string2vector' and 'vector2string', with "comments and tests". Both functions may go a long way towards assisting a number of R-users to make their R-programming more productive. I am a novice R-programmer: I started dabbling in R less than two months ago, heavily influenced by examples of code I see, including within the R.org documents (monkey does what monkey sees). Before posting the two functions, I would really appreciate constructive edits where they may be needed, as well as their posting by someone-in-the-know so they will be conveniently accessible for R users.
>
> I am very impressed with the potential of R and the community supporting it. I just wish I got to R sooner: I am looking to R to better support my work in "designed experiments to assess the statistically significant performance of combinatorial optimization algorithms on instance isomorphs of NP-hard problems" -- for better context of this mouthful, see the few postings under
> http://www.cbl.ncsu.edu:16080/xBed/publications/
> I am working on a tutorial paper where I expect R to play a significant role in better explaining and illustrating, code-wise and graphically, the concepts discussed in the publications above. I would welcome a co-author with experience in R-programming as well as statistics and interests in the experimental methods addressed in these publications.
>
> As I elaborate in notes that follow, I was looking at a variety of "R-documents" before my "bug" submission. I would appreciate very much if some of you could take the time to scan through these notes and respond briefly with useful pointers. Here are the headlines:
>
> (1) why I still think there may be a bug with 'noquote' vs 'as.integer'
>
> (2) search on "split string" and "join string"; the missing package "stringr"
>
> (3) a take on "Tcl" commands 'split', 'join', 'string', 'append', 'foreach'
>
> (4) a take on "R" functions 'string2vector' and 'vector2string'
>
> (5) code and comments for "R" functions 'string2vector' and 'vector2string'
>
> (1) why I still think there may be a bug with 'noquote' vs 'as.integer'
> --------------------------------------------------------------------------------
>
>> # MacOSX 10.6.2, R 2.9.1 GUI 1.28 Tiger build 32-bit (5444)
>> qvector
>>
> [1] "0" "0" "0" "1" "1" "0" "1"
>
>> qvector[1]
>>
> [1] "0"
>
>> tmp = noquote(qvector[1])
>> tmp
>>
> [1] 0
>
>> tmp = as.integer(qvector[1])
>> tmp
>>
> [1] 0
>
> When embedded in the function as per my "bug" report, 'noquote' and 'as.integer' are no longer equivalent whereas in the example above they appear to be equivalent!! I submitted the "function" with print/cat statements for sake of illustration.
>
> (2) search on "split string" and "join string"; the missing package "stringr"
> --------------------------------------------------------------------------------
> http://search.r-project.org/ reveals
> on the order of 850 messages for search on "split string"
> on the order of 160 messages for search on "join string"
>
> http://finzi.psych.upenn.edu/search.html reveals
> for search on "split string"
> • Rhelp08: [ split: 890 ] [ string: 1676 ] [ TOTAL: 77 ]
> • functions: [ split: 954 ] [ string: 6453 ] [ TOTAL: 204 ]
> for search on "join string"
> • Rhelp08: [ join: 176 ] [ string: 1676 ] [ TOTAL: 8 ]
> • functions: [ join: 192 ] [ string: 6453 ] [ TOTAL: 36 ]
> This site also provides a link to the package "stringr"
> http://finzi.psych.upenn.edu/R/library/stringr/html/00Index.html
> However, the download does not deliver ...
>
>> install.packages("stringr")
>>
> ....
> package ‘stringr’ is not available
>
> There are a lot of hard-to-understand and not-so-relevant code snippets in all these 1000's of postings. I would argue that had robust functions such as 'string2vector' and 'vector2string' been included in the R-package, many R-programmers could take longer vacations, spend their time more productively,
> and significantly reduce duplication of coding efforts on basically the same
> problems.
>
> Since vector is such an important "primitive" in R, I argue that functions such as 'string2vector' and 'vector2string' should be made to play a role similar to commands 'split', 'join', 'string', and 'append' that support programmers in Tcl. See my take on Tcl in the section below.
>
> (3) a take on "Tcl" commands 'split', 'join', 'string', 'append', 'foreach'
> --------------------------------------------------------------------------------
> I have been using Tcl to "wrap" a number of combinatorial solvers and automate workflows that implement and execute a number of my experiments on instance isomorphs. I even used Tcl to prototype a few combinatorial optimization algorithms and write code for statistical analysis -- a task for which I now find R much better suited.
>
> I intend to alert my Tcl colleagues in-the-know about the wonderful infrastructure provided in R when it comes to the R-shell (at least under MacOSX), the ability to name and initialize function variable defaults explicitly, and the ability to install new packages so transparently. Before coming across R, I already took the trouble to create Tcl wrapper programs with command lines that feature the same order-independent syntax as the syntax used in R. This being said, what I miss in R is a single page that gathers all commands, such as
> http://www.tcl.tk/man/tcl8.5/TclCmd/contents.htm
> Note that once you click on any of the commands, a number of classes that extend each command become visible, including the example section(s).
>
> Here I illustrate my use of just five Tcl commands that subsequently guided my "design" of the functions 'string2vector' and 'vector2string' in "R"
>
> # few "Tcl" examples before designing the function 'string2vector' in "R"
> % set binS "10011"
> % join [split $binS ""] ", "
> 1, 0, 0, 1, 1
> %
> % set strS "I \t am\tdone"
> % foreach item [split $strS "\t"] {append strSQ \"$item\",}
> % set strSQ [string trimright $strSQ ,]
> "I "," am","done"
> #
> # few "Tcl" examples before designing the function 'vector2string' in "R"
> % set strV "1,0,0,1"
> 1,0,0,1
> % split $strV ","
> 1 0 0 1
> % join [split $strV ","] ":"
> 1:0:0:1
>
> (4) a take on "R" functions 'string2vector' and 'vector2string'
> --------------------------------------------------------------------------------
>
>> # few tests of the function 'string2vector' in "R"
>> binS = "10011"
>> binV = string2vector(binS, SS="", type="int")
>> binV[2] ; binV[5]
>>
> [1] 0
> [1] 1
>
>> strS = "I am done"
>> vecS = string2vector(strS, SS=" ", type="char")
>> vecS[1] ; vecS[3]
>>
> [1] "I"
> [1] "done"
>
>> # few tests of the function 'vector2string' in "R"
>> binV = c(1,0,0,1)
>> vector2string(binV, type="int")
>>
> [1] "1001"
>
>> vector2string(binV, SS=" ", type="char")
>>
> [1] "1 0 0 1"
>
>> subsV = c("I", "am", "done")
>> vector2string(subsV, SS=":", type="char")
>>
> [1] "I:am:done"
>
>
> (5) code and comments for "R" functions 'string2vector' and 'vector2string'
> --------------------------------------------------------------------------------
>
> string2vector = function(string="ch-2 \t sec-7\tex-5", SS="\t", type="char")
> #
> # This procedure splits a string and assigns substrings to an R-vector.
> # The split is controlled by the string separator SS (default value: SS="\t").
> # Here we convert a binary string into a binary vector:
> # let binS = "10011"
> # then binV = string2vector(binS, SS="", type="int")
> # Here we convert a string into a vector of substrings:
> # let strS = "I am done"
> # then vecS = string2vector(strS, SS=" ", type="char")
> #
> # LIMITATION: The function interprets all substrings either as of type
> # "int" or "char". A function that interprets the type of each
> # substring dynamically may one day be written by an R-guru.
> #
> # Franc Brglez, Wed Dec 9 14:19:16 EST 2009
> {
>     qlist = strsplit(string, SS) ; qvector = qlist[[1]]
>     n = length(qvector) ; xvector = NULL
>     # seq_len(n) gives an empty loop when n is 0, whereas 1:n would run over c(1, 0)
>     for (i in seq_len(n)) {
>         if (type == "int") {
>             tmp = as.integer(qvector[i])
>         } else {
>             tmp = qvector[i]
>         }
>         xvector = c(xvector, tmp)
>     }
>     return(xvector)
> } # string2vector
>
> vector2string = function(vector=c("ch-2", "sec-7", "ex-5"), SS="_", type="char")
> #
> # This procedure converts values from a vector to a concatenation of substrings
> # separated by user-specified string separator SS (default value: SS="_").
> # Each substring represents a vector component value, either as a numerical
> # value or as an alphanumeric string.
> # Here we convert a binary vector to a binary string representing an integer:
> # let binV = c(1,0,0,1)
> # then strS = vector2string(binV, type="int")
> # Here we convert a binary vector to string representing a binary sequence:
> # let binV = c(1,0,0,1)
> # then seqS = vector2string(binV, SS=" ", type="char")
> # Here we convert a vector of substrings to colon-separated string:
> # let subsV = c("I", "am", "done")
> # then strS = vector2string(subsV, SS=":", type="char")
> #
> # LIMITATION: The function interprets all substrings in the vector either as of
> # type "int" or "char". A function that interprets the type of each
> # substring dynamically may one day be written by an R-guru.
> #
> # Franc Brglez, Wed Dec 9 15:43:59 EST 2009
> {
>     if (type == "int") {
>         string = paste(strsplit(paste(vector), " "), collapse="")
>     } else {
>         n = length(vector) ; nm1 = n-1 ; string = ""
>         # seq_len() skips the loop for an empty or one-element vector,
>         # whereas 1:nm1 would count down and index out of range
>         for (i in seq_len(max(nm1, 0))) {
>             tmp = noquote(vector[i])
>             string = paste(string, tmp, SS, sep="")
>         }
>         tmp = noquote(vector[n])
>         string = paste(string, tmp, sep="")
>     }
>     return(string)
> } # vector2string
>
> ----------------
> Dr. Franc Brglez email: brglez_at_ncsu.edu
> Department of Computer Science, Box 8206 http://sitta.csc.ncsu.edu/~brglez
> North Carolina State University TEL: (919) 515-9675
> Raleigh NC 27695-8206 USA
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat 12 Dec 2009 - 19:03:28 GMT

This archive was generated by hypermail 2.2.0 : Sat 12 Dec 2009 - 22:31:06 GMT