Re: [R] Find String Between Characters

From: Sparks, John James <jspark4_at_uic.edu>
Date: Sat, 14 May 2011 21:14:14 -0500

Hi Jim,

Thanks for your note.

Unfortunately, when I attempt your solution in my exact setting, I get a weird and slightly different answer.

First, let me be clearer. What I am trying to do is pull the CIK number out of the content of the web page itself after it has been loaded into R (this may not be optimal, but I am new at this), not out of the URL string (as you have done).

So, when I execute the following as per your suggestion:

require(scrapeR)
mmm<-scrape(url="http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40")

num <- sub("^.*CIK=([0-9]+).*", "\\1", mmm)

I get
[1] "<pointer: 0x00000000001265c0>"

Is this just a hex representation of the same number, or is something else going on here?

Comments from any and all would be much appreciated.

--John J. Sparks, Ph.D.
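
[Editorial note] The output above does not look like a hex form of the CIK. The value printed appears to be the text form of an external pointer (a memory address), which is what one would expect if scrape() returns parsed document objects rather than character strings and sub() coerces them with as.character(). A minimal sketch in base R, assuming the page content is available as a character string: page_text below is a hypothetical stand-in, not the actual EDGAR page, and the saveXML() line in the comment is an untested assumption about how the parsed object could be turned back into text.

```r
# Hypothetical stand-in for the page text; the real EDGAR page is not reproduced here.
# With scrapeR, something like XML::saveXML(mmm[[1]]) might yield such a string
# (assumption, not verified against the package).
page_text <- paste0(
  '<html><body>',
  '<a href="/cgi-bin/browse-edgar?action=getcompany&amp;CIK=0000320193&amp;owner=exclude">',
  'filings</a></body></html>'
)

# Same pattern as in Jim's suggestion, now applied to a character string
cik <- sub("^.*CIK=([0-9]+).*", "\\1", page_text)
print(cik)

# Check whether the pointer address could be the CIK written in hex
addr <- strtoi("1265c0", base = 16L)
print(addr)
print(addr == as.integer(cik))
```

The last comparison is FALSE: 0x1265c0 is 1205696 in decimal, while the CIK is 320193, so the pointer text is unrelated to the number being sought.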

On Sat, May 14, 2011 7:57 pm, jim holtman wrote:
> Is this what you want:
>
>> mmm<-"http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40"
>> num <- sub("^.*CIK=([0-9]+).*", "\\1", mmm)
>> num
> [1] "0000320193"
>>
>
>
> On Sat, May 14, 2011 at 8:20 PM, Sparks, John James <jspark4_at_uic.edu> wrote:
>> Dear R Helpers,
>>
>> I am trying to isolate a set of characters between two other characters in
>> a long string file.  I tried some of the examples on the R help pages and
>> elsewhere, but I am not able to get it.  Your help would be much
>> appreciated.
>>
>> require(scrapeR)
>> mmm<-scrape(url="http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0000320193&owner=exclude&count=40")
>> str(mmm)
>>
>> I want to get the number 0000320193 that is between the CIK= and the &.
>> I have tried
>>
>> g <- grep( "CIK=|&", mmm )
>> and
>> temp<-grep(mmm,\CIK=\&)
>>
>> and variations on these themes, but they all either won't run or come back
>> as an empty object.  How can I grab this number?
>>
>> Best wishes,
>> --John J. Sparks, Ph.D.
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
>
>



Received on Sun 15 May 2011 - 13:19:23 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 15 May 2011 - 21:50:07 GMT.