[R] Getting many substrings but only loading the original string one time.

From: Jonathan <jonsleepy_at_gmail.com>
Date: Mon, 11 Apr 2011 15:48:13 -0400


Hi All,

    I'm looking for a way to get many substrings from a longer string and then stitch them together. But, since the longer string is really, really long (like 250 MB long), I don't want to do this in a loop and load and re-load the longer string many times. Does anybody have an idea?

Maybe I could pass in two vectors (the first would have the starting coordinates, and the second would have the stopping coordinates), so it would be like a vectorized version of substr, where start and stop would be vectors instead of single integers.

Example (I'm reducing the size of the string for the example) of how this might work:

> longerString <- 'HelloThisIsMyLongerString'
> startVector <- c(2,6,4)
> stopVector <- c(4,10,5)

> substrings <- vectorized_substr(longerString, startVector, stopVector)
> substrings

[1] "ell" "ThisI" "lo"

Then I'd like to concatenate them (there will be many of them)

> result <- paste(substrings,collapse='')
> result

[1] "ellThisIlo"

(perhaps the paste command as I've done it is the best way, but depending on how the substrings are reported there may be different ways). Thanks!
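
For reference, base R's substring() recycles its first and last arguments, so it may already behave like the vectorized_substr() imagined above. The following is only a minimal sketch on the shortened example string; the variable names are the ones from the example, and how it performs on a 250 MB string is an assumption that has not been checked here.

```r
# Shortened stand-in for the real (much longer) string.
longerString <- "HelloThisIsMyLongerString"
startVector  <- c(2, 6, 4)
stopVector   <- c(4, 10, 5)

# substring() is vectorized over 'first' and 'last': the single string is
# recycled against the start/stop vectors, giving one substring per pair.
substrings <- substring(longerString, startVector, stopVector)
print(substrings)   # "ell" "ThisI" "lo"

# Stitch the pieces together in the order they were requested.
result <- paste(substrings, collapse = "")
print(result)       # "ellThisIlo"
```

Because substring() is called once with the whole start and stop vectors, no explicit loop over the coordinates is needed, and paste(..., collapse = "") joins the pieces in a single call.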

Jonathan




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 11 Apr 2011 - 19:51:56 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 11 Apr 2011 - 20:40:29 GMT.
