Re: [R] Histograms with strings, grouped by repeat count (w/ data)

From: Deepayan Sarkar <deepayan.sarkar_at_gmail.com>
Date: Tue, 19 Jun 2007 11:34:03 -0700

On 6/18/07, Matthew Trunnell <trunnell_at_cognix.net> wrote:
> Aha! So to expand that from the original expression,
>
> > table(table(d$filename, d$email_addr))
>
> 0 1 2 3
> 253 20 8 9
>
> I think that is exactly what I'm looking for. I knew it must be
> simple!!! What does the 0 column represent?

The number of filename:email_addr combinations that do not occur in the data, i.e. the cells of the inner table whose count is zero.
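A minimal sketch on made-up data may make this concrete. The data frame toy below is hypothetical and only stands in for your d; the variable names inner and count_of_counts are mine.

```r
# Hypothetical toy data, not the poster's d
toy <- data.frame(
  filename   = c("a.pdf", "a.pdf", "b.pdf", "c.pdf", "c.pdf", "c.pdf"),
  email_addr = c("x@example.org", "x@example.org", "x@example.org",
                 "y@example.org", "y@example.org", "y@example.org")
)

# Inner table: one cell for every filename x email_addr combination,
# including combinations that never occur (those cells hold 0)
inner <- table(toy$filename, toy$email_addr)
print(inner)

# Outer table: how many cells hold each count
count_of_counts <- table(inner)
print(count_of_counts)   # 0 -> 3, 1 -> 1, 2 -> 1, 3 -> 1

# Dropping the empty combinations before tabulating removes the 0 column
observed_only <- table(inner[inner > 0])
print(observed_only)     # 1 -> 1, 2 -> 1, 3 -> 1
```

Here 3 of the 6 cells are empty (a.pdf/y, b.pdf/y, c.pdf/x), which is what the 0 column reports.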

> Also, does this tell me the same thing, filtered by Japan?
> > table(table(d$filename, d$email_addr, d$country_residence)[d$country_residence=="Japan"])
>
> 0 1 2 3
> 958 5 2 1

No it doesn't.

> length(table(d$filename, d$email_addr, d$country_residence))
[1] 4350
> length(d$country_residence)
[1] 63

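You are using an index that is meaningless: it has one element per row of d (63), whereas the table has one element per filename:email_addr:country_residence combination (4350), so position i of the index has nothing to do with cell i of the table.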
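To see what such an index actually does, here is a small sketch on made-up vectors (cells and row_flag are hypothetical stand-ins for the 4350-cell table and the 63-element comparison). When a logical index is shorter than the object being indexed, R recycles it without a warning:

```r
cells    <- 1:10                    # stands in for the cells of the table
row_flag <- c(TRUE, FALSE, FALSE)   # stands in for d$country_residence == "Japan"

# The 3-element index is recycled to length 10:
# TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
picked <- cells[row_flag]
print(picked)   # 1 4 7 10
```

The cells that get picked depend only on where they happen to sit in the table, not on anything related to Japan.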

There's an alternative tabulation function that uses a formula interface similar to that used in modeling functions; this might be more transparent for your case:

> count <-
+     xtabs(~filename + email_addr, data = d,
+           subset = country_residence == "Japan")
> xtabs(~count)
count
  0   1   3
284   2   4
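For comparison, here is a sketch of three ways to get the Japan-only table, again on hypothetical toy data rather than your d. It assumes the columns are factors, so that unused levels are kept and all three tables have the same shape; the names by_xtabs, by_rows and by_slice are mine.

```r
# Hypothetical toy data; stringsAsFactors = TRUE keeps all levels after subsetting
toy <- data.frame(
  filename   = c("a.pdf", "a.pdf", "b.pdf", "c.pdf", "c.pdf", "c.pdf"),
  email_addr = c("x@example.org", "x@example.org", "x@example.org",
                 "y@example.org", "y@example.org", "y@example.org"),
  country_residence = c("Japan", "Japan", "Peru", "Japan", "Peru", "Peru"),
  stringsAsFactors = TRUE
)

# 1. Formula interface with a subset argument
by_xtabs <- xtabs(~ filename + email_addr, data = toy,
                  subset = country_residence == "Japan")

# 2. Subset the rows of the data frame first, then tabulate
japan   <- toy[toy$country_residence == "Japan", ]
by_rows <- table(japan$filename, japan$email_addr)

# 3. Build the three-way table, then take the "Japan" slice by name
full     <- table(toy$filename, toy$email_addr, toy$country_residence)
by_slice <- full[, , "Japan"]

# All three hold the same counts; tabulate them as before
print(table(as.vector(by_slice)))   # 0 -> 4, 1 -> 1, 2 -> 1
```

In each case the condition is applied either to the rows of the data or to a named dimension of the table, never to the cells of a table built from all rows.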

> How does that differ logically from this?
>
> > table(table(d$filename, d$email_addr)[d$country_residence=="Japan"])
>
> 0 1 2 3
> 51 4 2 1

This is also using meaningless indexing.

Note, incidentally, that you are indexing a matrix of dimension 10x29 as if it were a vector of length 290, which is probably not what you meant to do anyway:

> str(table(d$filename, d$email_addr))
 'table' int [1:10, 1:29] 1 0 0 0 0 0 0 0 0 0 ...

You need to read help(Extract) carefully and play around with some simple examples.
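As a starting point, a simple example of the kind I mean (the matrix m is made up). A single index treats the matrix as one long vector in column-major order, while one index per dimension selects whole rows or columns:

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

# One index: m is treated as the vector 1:6, and the index is recycled
as_vector <- m[c(FALSE, TRUE, TRUE)]
print(as_vector)   # 2 3 5 6

# One index per dimension: the same logical vector now selects columns c2 and c3
by_column <- m[, c(FALSE, TRUE, TRUE)]
print(by_column)   # a 2 x 2 matrix holding 3, 4, 5, 6

# Indexing by dimnames is usually the least error-prone
print(m["r1", "c2"])   # 3
```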

> I don't understand why that produces different results. The first one
> adds a third dimension to the table, but limits that third dimension
> to a single element, Japan. Shouldn't it be the same? And again,
> what's that zero column?

As before, they are the empty combinations.

-Deepayan



R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 19 Jun 2007 - 18:36:53 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 19 Jun 2007 - 19:32:15 GMT.