[R] text mining problem using TM package

From: Andy Adamiec <pl.rudy_at_gmail.com>
Date: Wed, 18 May 2011 11:54:44 -0500


Hi, I'm using R (TM package) for text mining and I'm having problems filtering articles out of my data set by local meta data.

Here is the code:

data <- ("C:/… /19970331")

rs <- ReutersSource(data, encoding = "UTF-8")

RC <- VCorpus(DirSource(data),
              readerControl = list(reader = readRCV1asPlain,
                                   language = "en_US",
                                   load = TRUE),
              dbControl = list(useDb = TRUE,
                               dbName = "texts.db",
                               dbType = "DB1"))

tm_index(RC, FUN = sFilter, doclevel = F, useMeta = T, "Topics == 'MCAT'")

When I use sFilter, I can only filter on the fields in yellow, but I want to filter on the fields in red. What am I doing wrong?

Thanks, Andy

This is the meta data that is attached to each article:

Available meta data pairs are:
  Author :
  DateTimeStamp: 1997-03-31
  Description :
  Heading : USA: WHX begins tender offer for Dynamics Corp.
  ID : 476871
  Language : en_US
  Origin : Reuters Corpus Volume 1
User-defined local meta data pairs are:
$Publisher
[1] "Reuters Holdings Plc"

$Topics
[1] "C18" "C181" "CCAT"

$Industries
[1] "I22100" "I34000"

$Countries
[1] "USA"

[[alternative HTML version deleted]]
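
For illustration only, here is a minimal base-R sketch of the kind of filter described above: selecting articles whose Topics field contains a given code. It does not use the tm package or sFilter at all. The `docs` list, the second article, and the helper `has_topic` are hypothetical stand-ins, not part of the original data or of any tm API. The point it shows is that Topics holds several codes per article, so a membership test with `%in%` gives one TRUE/FALSE per article, whereas `==` would give one per code.

```r
# Hypothetical stand-ins for the user-defined local meta data of two articles.
# The first mirrors the listing above; the second is invented for illustration.
docs <- list(
  "476871"  = list(Topics = c("C18", "C181", "CCAT"), Countries = "USA"),
  "example" = list(Topics = c("M14", "MCAT"), Countries = "UK")
)

# Hypothetical helper: TRUE if the article carries the given topic code.
# %in% returns a single logical here; Topics == code would return a vector.
has_topic <- function(doc, code) code %in% doc$Topics

# One logical per article, then subset the list with it.
keep <- vapply(docs, has_topic, logical(1), code = "MCAT")
mcat_docs <- docs[keep]

print(names(mcat_docs))
```

The same membership test could in principle be wrapped in whatever filtering function the installed tm version provides, but that step is left out here because it depends on the package version.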



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 18 May 2011 - 17:42:56 GMT
