From: jim holtman <jholtman_at_gmail.com>

Date: Tue 04 Jul 2006 - 11:59:43 EST

Here is a modification of Gabor's solution that returns the data frame itself, keeping for each column name only the column with the most non-zero entries:

# test data
# read in header separately so R does not make column names unique
Lines <- "AAA BBB CCC DDD AAA BBB
0 2 1 2 0 0
2 3 7 6 0 1
1.5 4 9 9 6 0
1.0 6 10 11 3 3
"

DF <- read.table(textConnection(Lines), skip = 1)
names(DF) <- scan(textConnection(Lines), what = "", nlines = 1)

f <- function(x) x[which.max(colSums(DF[x]!=0))]
tapply(seq(DF), names(DF), f)

#================added code================#
# compute the number of non-zeros in each column
MostZeros <- colSums(DF != 0)

# determine which column is the maximum
x.max <- lapply(unique(names(DF)), function(.name){
    .col <- which(names(DF) == .name)   # find columns of matching names
    .max <- which.max(MostZeros[.col])  # determine max
    .col[.max]                          # return the column number of the max
})

DF[unlist(x.max)]  # select only the unique maximums
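
Since the original question was about a matrix rather than a data frame, here is a minimal sketch of the same idea wrapped in a function. The name keep_fullest_duplicate is made up for illustration and is not from the thread. It assumes the input has column names and no NA values. Ties go to the leftmost column, because which.max() returns the first maximum.

```r
# Sketch only: keep_fullest_duplicate() is a hypothetical helper name.
# Assumes `m` is a matrix (or data frame) with column names and no NAs.
keep_fullest_duplicate <- function(m) {
  nonzero <- colSums(m != 0)   # number of non-zero entries per column
  idx <- seq_len(ncol(m))      # column positions 1..ncol
  # for each distinct column name, pick the position with the most
  # non-zeros; which.max() returns the first maximum on ties
  keep <- tapply(idx, colnames(m), function(i) i[which.max(nonzero[i])])
  # sort so the kept columns stay in their original left-to-right order
  m[, sort(as.vector(keep)), drop = FALSE]
}

# the matrix from the original question
TEMP <- matrix(c(0,   2,  1,  2, 0, 0,
                 2,   3,  7,  6, 0, 1,
                 1.5, 4,  9,  9, 6, 0,
                 1.0, 6, 10, 11, 3, 3),
               nrow = 4, byrow = TRUE,
               dimnames = list(NULL,
                               c("AAA", "BBB", "CCC", "DDD", "AAA", "BBB")))

res <- keep_fullest_duplicate(TEMP)
colnames(res)   # "AAA" "BBB" "CCC" "DDD"
```

For this test data the result is the first four columns, which matches what the original poster expected.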

On 7/3/06, Gabor Grothendieck <ggrothendieck@gmail.com> wrote:

> Try this:
>
> # test data
> # read in header separately so R does not make column names unique
> Lines <- "AAA BBB CCC DDD AAA BBB
> 0 2 1 2 0 0
> 2 3 7 6 0 1
> 1.5 4 9 9 6 0
> 1.0 6 10 11 3 3
> "
> DF <- read.table(textConnection(Lines), skip = 1)
> names(DF) <- scan(textConnection(Lines), what = "", nlines = 1)
>
> f <- function(x) x[which.max(colSums(DF[x]!=0))]
> tapply(seq(DF), names(DF), f)
>
> On 7/3/06, markleeds@verizon.net <markleeds@verizon.net> wrote:
> >
> > hi everyone :
> >
> > suppose i have a matrix in which some column names are identical so,
> > for example, TEMP
> >
> > "AAA", "BBB", "CCC", "DDD","AAA", "BBB"
> > 0 2 1 2 0 0
> > 2 3 7 6 0 1
> > 1.5 4 9 9 6 0
> > 1.0 6 10 11 3 3
> >
> > I didn't even check yet whether identical column names are allowed
> > in a matrix but i hope they are.
> >
> > assuming that they are, then i would like to be able to take the matrix
> > and make a new matrix with the following requirements.
> >
> > 1) whenever there is a unique column name, just take that column for the
> > new matrix
> >
> > 2) whenever the column name is not unique, take the one
> > that has the most non zero elements ? ( in the case of
> > ties, i don't care which one is picked ).
> >
> > so, in this case, the resulting matrix would just be the first 4
> > columns.
> >
> > i realize ( or at least i think ) that
> > sum( TEMP[(TEMP[,columnname] !=0) ,columnname) will give me the
> > number of non-zero elements in a column with the name columnname
> > but how to use this to deal with the non uniqueness to solve my particular
> > problem is beyond me. plus, i think the command will
> > bomb because columnname will not always be unique ?
> > Thanks for any help. I realize this is not a trivial problem so I really
> > appreciate it.
> >
> > Mark
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?

Received on Tue Jul 04 12:07:06 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Tue 04 Jul 2006 - 14:13:48 EST.
