Re: [R] vlmc - "In vlmc(traffic.clusters.stationary, cutoff = i) : alphabet with >1-letter strings; trying to abbreviate"

From: Martin Maechler <maechler_at_stat.math.ethz.ch>
Date: Wed, 30 Jun 2010 14:58:45 +0200

>>>>> "CA" == Constantinos Antoniou <constantinos.antoniou.rlists_at_gmail.com>
>>>>>     on Wed, 30 Jun 2010 12:07:16 +0300 writes:

    CA> Dear all (copying the package author),

    CA> I have a question on the vlmc package. I am trying to
    CA> model a time series, where each element can take one of
    CA> 11 values (the result of some clustering). When I run
    CA> the following command (synthetic data to facilitate
    CA> self-contained example) 

(very good)

> I get the following warning: ("alphabet with >1-letter strings; trying to
> abbreviate")

> +++ START+++

    >> library(VLMC)
    >> a <- floor(runif(1000,0,11))
    >> vc <- vlmc(a,cutoff=5)

> Warning message:
> In vlmc(a, cutoff = 5) :
> alphabet with >1-letter strings; trying to abbreviate
>> vc
> 'vlmc' a Variable Length Markov Chain;
> alphabet 'abcdefghijk', |alphabet| = 11, n = 1000.
> Call: vlmc(dts = a, cutoff.prune = 5)
> -> extensions (= $size ) :
>    ord.MC   context nr.leaves     total
>         2        72        61      1608
> AIC = 5247
>>
> +++ END+++

> The questions are:
> 1. What is it trying to do?

Your data contains the values 0 1 2 .. 10;
vlmc() tries to match them to 1-letter strings, but '10' "is 2 letters".
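
To see why only '10' is the odd one out, here is a small base-R sketch.
It is only an illustration of the idea, not what vlmc() does internally:
the set.seed() call and the names alphabet, width and long are
assumptions added for this example.

```r
# Synthetic data as in the original post: integers 0..10
set.seed(1)  # assumption: seed added only to make the example reproducible
a <- floor(runif(1000, 0, 11))

# Distinct values converted to strings, as a stand-in for the "alphabet"
alphabet <- as.character(sort(unique(a)))

# Number of characters in each symbol
width <- nchar(alphabet)
names(width) <- alphabet
print(width)

# Symbols that are longer than one character
long <- alphabet[width > 1]
print(long)
```

Every symbol except "10" has width 1, which is why an alphabet of 11
integer codes starting at 0 triggers the message.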

> 2. How is it abbreviating?

(not really important: using abbreviate())

> 3. How much should I worry about it?

not at all. The warning is just to inform you that your input looks a bit "unusual" to vlmc.

I do agree however, that one could argue that vlmc() should work for inputs with values

     0:m
or 1:n
without a warning.

> 4. What can I do?

You could use

    vc <- vlmc(letters[1+a], cutoff=5)

to get the exact same model, but without a warning, or

    vc <- vlmc(a, cutoff=5, quiet = TRUE)

or

    vc <- vlmc(a, cutoff=5, code1char = FALSE)
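
For the letters[1+a] variant, the following base-R sketch checks that
the recoding is a one-to-one relabelling of the 11 values, so no
information is lost before the data reach vlmc(). The set.seed() call
and the names b, tab and back are assumptions made for this example; it
does not call vlmc() itself.

```r
# Synthetic data as in the original post: integers 0..10
set.seed(1)  # assumption: seed added only to make the example reproducible
a <- floor(runif(1000, 0, 11))

# Recode 0 -> "a", 1 -> "b", ..., 10 -> "k"
b <- letters[1 + a]

# Every recoded symbol is a single character
print(all(nchar(b) == 1))

# Cross-tabulate old against new codes: one nonzero cell per row and column
tab <- table(a, b)
print(all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1))

# The recoding can be inverted to recover the original integers
back <- match(b, letters) - 1
print(all(back == a))
```

The recoded alphabet a..k matches the one shown in the printed model
above ('abcdefghijk').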

> I have looked at the documentation plus
> Mächler M. and Bühlmann P. (2004) Variable Length Markov Chains:
> Methodology, Computing, and Software. _J. Computational and
> Graphical Statistics_ *2*, 435-455.

That's good.
The examples there all have a character vector (of strings with 1 letter/character) as input.

> Thanks for any feedback,

You're welcome!
Martin Maechler, ETH Zurich

> --
> Constantinos Antoniou, Ph.D., Assistant Professor
> National Technical University of Athens
> Laboratory of Transportation Engineering
> School of Rural and Surveying Engineering
> 9 Heroon Politechniou st., 15780-Zografou, Athens, Greece
> T: +30 210 7722783 - F: +30 210 7722629
> antoniou@central.ntua.gr - http://users.ntua.gr/antoniou

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Wed 30 Jun 2010 - 13:06:28 GMT
