Re: [R] RegExp question

From: Andrej <andrej.kastrin_at_gmail.com>
Date: Wed, 16 Jun 2010 10:05:06 -0700 (PDT)

Sorry, I apologize. Below is the minimal example.

library(RWeka)
model <- J48(as.factor(Species)~., data = iris)
model

J48 pruned tree


Petal.Width <= 0.6: setosa (50.0)
Petal.Width > 0.6
|   Petal.Width <= 1.7
|   |   Petal.Length <= 4.9: versicolor (48.0/1.0)
|   |   Petal.Length > 4.9
|   |   |   Petal.Width <= 1.5: virginica (3.0)
|   |   |   Petal.Width > 1.5: versicolor (3.0/1.0)
|   Petal.Width > 1.7: virginica (46.0/1.0)

Number of Leaves  : 	5

Size of the tree : 	9

So, the task is to extract the number of leaves.

Andrej
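
A minimal sketch of one way to pull out the leaf count, assuming the model's printed text is already available as an ordinary character string. The names txt, lines, leaf_line and n_leaves are made up for illustration, and the string is a shortened copy of the output above rather than a live J48 object:

```r
# Character form of the Weka output (shortened copy of the printout above).
# With a live rJava reference `obj` of class "jobjRef", a call such as
# rJava::.jcall(obj, "S", "toString") should return a string like this one;
# that call is an assumption here and is not run in this sketch.
txt <- paste0(
  "J48 pruned tree\n\n",
  "Petal.Width <= 0.6: setosa (50.0)\n",
  "Petal.Width > 0.6\n",
  "|   Petal.Width > 1.7: virginica (46.0/1.0)\n\n",
  "Number of Leaves  : \t5\n\n",
  "Size of the tree : \t9\n"
)

# Split the text into lines and keep the one that reports the leaf count.
lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
leaf_line <- grep("^Number of Leaves", lines, value = TRUE)

# Remove every non-digit character, then convert what is left to an integer.
n_leaves <- as.integer(gsub("[^0-9]", "", leaf_line))
n_leaves
```

Selecting the line by its label first avoids depending on what comes before or after it in the string, so the same steps should also work for the one-leaf string quoted below.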

On Jun 16, 6:58 pm, David Winsemius <dwinsem..._at_comcast.net> wrote:
> Publicly produce something we can work with. I have no idea how to  
> create an example that will match such an object.
>
> ?dput
> ?dump
>
> Read Posting Guide.
> --
> David.
>
> On Jun 16, 2010, at 12:54 PM, Andrej wrote:
>
>
>
> > Thanks David for your fast reply, but now I realized that "string" is
> > of type:
>
> >> class(string)
> > [1] "jobjRef"
> > attr(,"package")
> > [1] "rJava"
>
> > so I get an error when I try with gsub or sub:
>
> >> sub("^.+\\t(\\d+)\\n.+$", "\\1", string)
> > Error in as.character.default(x) :
> >  no method for coercing this S4 class to a vector
>
> > I think that there should be a trivial solution, but... Any further
> > ideas?
>
> > Regards, Andrej
>
> > On Jun 16, 6:47 pm, David Winsemius <dwinsem..._at_comcast.net> wrote:
> >> On Jun 16, 2010, at 12:04 PM, Andrej wrote:
>
> >>> Dear all,
>
> >>> I'm trying to filter out the "number of leaves" (it should be 1 in the
> >>> example below) from the following string:
>
> >>>> string
> >>> [1] "Java-Object{J48 pruned tree\n------------------\n: 0  
> >>> (15.0/3.0)\n
> >>> \nNumber of Leaves  : \t1\n\nSize of the tree : \t1\n}"
>
> >>> Any idea how to do that as simply as possible? Thanks in advance for
> >>> any advice.
>
> >> ?sub   # or ?gsub if you need more than one pattern matched (they are
> >> on the same page).
>
> >> This should find the first occurrence of digits following a tab
> >> terminated by a line feed and then return only the digits:
>
> >> string <- "Java-Object{J48 pruned tree\n------------------\n: 0
> >> (15.0/3.0)\n \nNumber of Leaves  : \t1\n\nSize of the tree : \t1\n}"
> >> sub("^.+\\t(\\d+)\\n.+$", "\\1", string)
> >> [1] "1"
>
> >> The parenthesized group in the search pattern is what "\\1" refers to in
> >> the replacement. Backslashes need to be doubled within patterns.
>
> >>> Regards, Andrej
>
> >> --
>
> >> David Winsemius, MD
> >> West Hartford, CT
>
> >> ______________________________________________
> >> R-h...@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT



Received on Wed 16 Jun 2010 - 17:07:53 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 16 Jun 2010 - 18:20:32 GMT.
