Re: [R] subsetting a data set

From: Graham Smith <myotisone_at_gmail.com>
Date: Fri 08 Sep 2006 - 11:11:23 GMT

Sian

On 08/09/06, Sean O'Riordain <seanpor@acm.org> wrote:
>
> Hi Graham,
> Try creating a new column with the two levels that you want...
>
> something along the lines of (warning untested!!!)
>
> GQ1$newColumn <- NA  # create the column before filling it in
> GQ1$newColumn[(GQ1$Status == "Expert") | (GQ1$Status == "Ecol")] <- "AllEcol"
> GQ1$newColumn[GQ1$Status == "Stake"] <- "Stake"
>
> and then do the
> by(GQ1[,"Max"], list(GQ1$newColumn), summary)
>
> when in doubt... break the problem into smaller chunks... :-)
>
> cheers,
> Sean
>
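An alternative to filling a new column row by row is to merge the factor levels directly. The following is a minimal sketch, not taken from the thread: the small GQ1 data frame is a hypothetical stand-in with only the Status and Max columns, and it relies on the documented behaviour that `levels<-` accepts a named list mapping new levels to old ones.

```r
# Hypothetical stand-in for GQ1; only Status and Max are modelled here.
GQ1 <- data.frame(
  Status = factor(c("Expert", "Ecol", "Stake", "Ecol", "Stake", "Expert")),
  Max    = c(10, 12, 7, 15, 9, 11)
)

# Copy the factor, then merge "Expert" and "Ecol" into one level "AllEcol".
GQ1$newColumn <- GQ1$Status
levels(GQ1$newColumn) <- list(AllEcol = c("Expert", "Ecol"), Stake = "Stake")

# Summarise Max within each of the two combined groups.
res <- by(GQ1[, "Max"], list(GQ1$newColumn), summary)
print(res)
```

Keeping the original Status column untouched means the three-level analysis is still available afterwards.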
> On 08/09/06, Graham Smith <myotisone@gmail.com> wrote:
> > Petr,
> >
> > Thanks again, but the data is GQ1, Max is a variable (column)
> >
> > So I have used
> >
> > by(GQ1[,"Max"], list(GQ1$Status), summary)
> >
> > Which is very good, and is better than the way I did it before by
> > summarising for each status level individually, but that still isn't
> > combining the data for Status == "Expert" and Status == "Ecol"
> >
> > So at the moment the Status variable has 3 levels: Expert, Ecol and
> > Stake.
> >
> > I want to analyse that at two levels: Expert and Ecol combined into a
> > new level called "AllEcol", and the existing level "Stake".
> >
> > It is this combining the levels that has got me stuck.
> >
> > Thanks again,
> >
> > Graham
> >
> > On 08/09/06, Petr Pikal <petr.pikal@precheza.cz> wrote:
> > >
> > > Sorry, I did not notice that in your case Max is not a function but
> > > your data. So probably
> > >
> > > by(Max[, your.columns], list(Max$status), summary)
> > >
> > > is maybe what you want.
> > > HTH
> > > Petr
> > >
> > >
> > > On 8 Sep 2006 at 10:31, Petr Pikal wrote:
> > >
> > > From: "Petr Pikal" <petr.pikal@precheza.cz>
> > > To: "Graham Smith" <myotisone@gmail.com>,
> > > r-help@stat.math.ethz.ch
> > > Date sent: Fri, 08 Sep 2006 10:31:12 +0200
> > > Priority: normal
> > > Subject: Re: [R] subsetting a data set
> > >
> > > > Hi
> > > >
> > > > I am not sure if your Max is the same as max, so I am not sure what
> > > > exactly you want from your data. However, you should consult ?tapply,
> > > > ?by, ?aggregate and maybe also ?"[", together with chapter 2 of the
> > > > intro manual in the docs directory.
> > > >
> > > > aggregate(data[, some.columns], list(data$factor1, data$factor2), max)
> > > >
> > > > will give you the maximum for the specified columns, splitting the
> > > > data according to both factors.
> > > >
> > > > Also, combining summary with max is unusual, and I wonder what
> > > > your output is in this case. I believe that there are six identical
> > > > numbers. However, R is case sensitive and maybe Max does something
> > > > different from max. In my case it throws an error.
> > > >
> > > > HTH
> > > > Petr
> > > >
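To make the aggregate() call above concrete, here is a small sketch on made-up data. The data frame `dat`, its column names and its values are all hypothetical; only the shape of the call follows the message.

```r
# Hypothetical data: two grouping factors and two numeric columns.
dat <- data.frame(
  factor1 = factor(c("a", "a", "b", "b", "a", "b")),
  factor2 = factor(c("x", "y", "x", "y", "x", "y")),
  v1 = c(1, 5, 3, 8, 4, 2),
  v2 = c(10, 20, 30, 40, 50, 60)
)

# Maximum of v1 and v2 within every combination of the two factors.
# Naming the list elements gives readable grouping columns in the result.
res <- aggregate(dat[, c("v1", "v2")],
                 list(f1 = dat$factor1, f2 = dat$factor2),
                 max)
print(res)
```

The result has one row per factor combination that occurs in the data, with the grouping columns first.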
> > > > On 8 Sep 2006 at 8:06, Graham Smith wrote:
> > > >
> > > > Date sent: Fri, 8 Sep 2006 08:06:16 +0100
> > > > From: "Graham Smith" <myotisone@gmail.com>
> > > > To: r-help@stat.math.ethz.ch
> > > > Subject: [R] subsetting a data set
> > > >
> > > > > I have a data set called GQ1, which has 20 variables, one of which
> > > > > is a factor called Status with three levels: "Expert", "Ecol" and
> > > > > "Stake".
> > > > >
> > > > > I have managed to evaluate some of the data split by status using
> > > > > commands like:
> > > > >
> > > > > summary (Max[Status=="Ecol"])
> > > > >
> > > > > BUT how do I produce a summary for Ecol and Expert combined? The
> > > > > only example I can find suggests I could use
> > > > >
> > > > > summary (Max[Status=="Ecol"& Status=="Expert"]) but that doesn't
> > > > > work.
> > > > >
> > > > > Additionally, in the same vein, I cannot work out how to create a
> > > > > new data set that would contain all the data for all the variables
> > > > > but only for the rows where Status equals Ecol, or where Status
> > > > > equals either Ecol or Expert.
> > > > >
> > > > > I know this is yet again a very simple problem, but I really can't
> > > > > find the solution in the help or the books I have.
> > > > >
> > > > > Many thanks,
> > > > >
> > > > > Graham
> > > > >
> > > > > [[alternative HTML version deleted]]
> > > > >
> > > > > ______________________________________________
> > > > > R-help@stat.math.ethz.ch mailing list
> > > > >
https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide
> > > > > http://www.R-project.org/posting-guide.html and provide commented,
> > > > > minimal, self-contained, reproducible code.
> > > >
> > > > Petr Pikal
> > > > petr.pikal@precheza.cz
> > > >
> > >
> > > Petr Pikal
> > > petr.pikal@precheza.cz
> > >
> > >
> >
> >
>
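For the original question at the bottom of the thread, the `&` condition fails because no single row can have Status equal to both "Ecol" and "Expert" at once; the test needs "or" rather than "and". A minimal sketch follows, using a hypothetical stand-in for GQ1 with only the Status and Max columns, and `%in%` as one way to express the "or".

```r
# Hypothetical stand-in for GQ1; only Status and Max are modelled here.
GQ1 <- data.frame(
  Status = factor(c("Expert", "Ecol", "Stake", "Ecol", "Stake", "Expert")),
  Max    = c(10, 12, 7, 15, 9, 11)
)

# No row is both "Ecol" and "Expert", so & selects nothing.
none <- GQ1$Max[GQ1$Status == "Ecol" & GQ1$Status == "Expert"]

# %in% (or two == tests joined with |) selects rows in either level.
both <- GQ1$Max[GQ1$Status %in% c("Ecol", "Expert")]
print(summary(both))

# A new data frame with every column, restricted to those rows.
# droplevels() removes the now-unused "Stake" level from the factor.
ecol_expert <- droplevels(subset(GQ1, Status %in% c("Ecol", "Expert")))
print(ecol_expert)
```

Dropping the unused level is optional, but it keeps later calls such as by() or table() from reporting an empty "Stake" group.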


Received on Fri Sep 08 21:20:28 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Fri 08 Sep 2006 - 11:30:04 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.