# Re: [R] Summarize data for MCA (FactoMineR)

From: David Winsemius <dwinsemius_at_comcast.net>
Date: Sun, 27 Apr 2008 15:10:19 +0000 (UTC)

"Nelson Castillo" <nelsoneci_at_gmail.com> wrote in news:2accc2ff0804251655o32686b99j73cf7df37243d08f_at_mail.gmail.com:

> Hi :-)
>
> I'm new to R and I started using it for a project (I'm the CS guy in
> a group of statisticians helping them find out how to solve issues
> as they come out). This is my first post to the list and I am
> starting to learn R.
>
> Well, they were used to doing MCA analysis in other programs where
> the data seems to be preprocessed automatically before running MCA.
>
> So, they need to process a data set with N=1000000 elements, of
> which only about N/100 are distinct across all the variables, so
> the MCA can be run in reasonable time if the data is summarized
> first.
>
> So, the question is:
>
> How can I turn x from:
>
> x <-
> structure(list(weight = c(1, 1, 2, 1, 2), var1 = structure(c(1L,
> 1L, 1L, 1L, 2L), .Label = c("A", "C"), class = "factor"), var2 =
> structure(c(1L,
> 1L, 1L, 1L, 2L), .Label = c("B", "D"), class = "factor")), .Names =
> c("weight", "var1", "var2"), row.names = c(NA, 5L), class =
> "data.frame")
>
> to:
>
> y <-
> structure(list(weihgt = c(5L, 2L), var1 = structure(1:2, .Label =
> c("A", "C"), class = "factor"), var2 = structure(1:2, .Label =
> c("B", "D"), class = "factor")), .Names = c("weihgt", "var1", "var2"
> ), class = "data.frame", row.names = c(NA, -2L))
>
> using R?
>
> That is, from:
>

```
> x
  weight var1 var2
1      1    A    B
2      1    A    B
3      2    A    B
4      1    A    B
5      2    C    D
```
>
> to:
>
```
> y
  weihgt var1 var2
1      5    A    B
2      2    C    D
```
>
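```
# sum weight within each var1/var2 combination
# (the start of this call was cut off in the archive; with(x, ...) is assumed)
s.wt <- with(x, aggregate(weight, by=list(var1=var1, var2=var2), sum))
#> s.wt
#  var1 var2 x
#1    A    B 5
#2    C    D 2

#then fix the name of the summed column only
names(s.wt)[3] <- "weight"
#> s.wt
#  var1 var2 weight
#1    A    B      5
#2    C    D      2
```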
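As a possible alternative, the formula interface of `aggregate()` keeps the column name `weight`, so the renaming step is not needed. This is only a minimal sketch: the data frame is rebuilt here from the values in the question, and the name `y2` is made up for illustration.

```r
# Example data from the question (same values as the dput() output)
x <- data.frame(
  weight = c(1, 1, 2, 1, 2),
  var1   = factor(c("A", "A", "A", "A", "C")),
  var2   = factor(c("B", "B", "B", "B", "D"))
)

# Formula interface: sum weight within each var1/var2 combination.
# The summed column keeps the name "weight".
y2 <- aggregate(weight ~ var1 + var2, data = x, FUN = sum)

# Put weight first, as in the layout asked for in the question
y2 <- y2[, c("weight", "var1", "var2")]

# Sanity check: summarizing must not change the total weight
stopifnot(sum(y2$weight) == sum(x$weight))
```

Only combinations that actually occur in the data get a row, which is what matters when the distinct rows are a small fraction of the full data set.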
--
David Winsemius

>
> The idea is that the combination "A B" occurs 4 times in the
> original table and is summarized as a single row in the second
> table, whose weight is the sum of the original weights.
>
> I solved the problem using Perl, but I'd like to know how to do it
> in R.
>
> Regards,
> Nelson.-
>

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help