Re: [R] data frames, na.omit, and sums

From: Petr Pikal <>
Date: Tue 06 Dec 2005 - 00:05:39 EST


Hi, please try to follow this advice from the list footer:
> PLEASE do read the posting guide!

I guess you probably need the aggregate() function, something like

aggregate(your.df[, -(1:2)], list(semester = your.df$sem, year = your.df$year), sum, na.rm = TRUE)
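As a minimal sketch of how such a call behaves, here is a toy version of the data frame built from the three example rows quoted further down (your.df is only a placeholder name, and the real data has many more columns). Grouping on year alone gives yearly totals:

```r
# Toy data copied from the three example rows in the original post.
your.df <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991, 1992, 1992),
  C1e  = c(10, 3, NA),
  C1s  = c(2, 1, NA),
  C2e  = c(NA, 8, 100),
  C2s  = c(NA, 1, 10)
)

# Sum every course column within each year, skipping NAs.
# Extra arguments such as na.rm = TRUE are passed on to sum().
by.year <- aggregate(your.df[, -(1:2)],
                     list(year = your.df$year),
                     sum, na.rm = TRUE)
print(by.year)
```

Note that a group containing only NAs sums to 0 when na.rm = TRUE, so a course that was never offered in a given year shows up as 0 rather than NA.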

A simple working example showing what you did, what the response was, and how it failed to meet your expectations would be helpful.


On 4 Dec 2005 at 18:55, Jason Miller wrote:

From:           	Jason Miller <>
Date sent:      	Sun, 4 Dec 2005 18:55:06 -0600
Subject:        	[R] data frames, na.omit, and sums

> Dear R-helpers,
> New to R, I'm in the middle of a project that I'm using to force me
> to learn R. I'm running into some behavior that I don't understand, and
> I need some advice. In the last week I've gotten some great advice
> from the list on visualizing my data, and I was hoping people could
> help me get over another barrier I've encountered to my progress.
> Before I describe what I'm trying to do and where I'm stuck with R,
> let me quickly outline what I need help with: (1) summing over the
> non-NA entries in each row of a data frame, and (2) using na.omit()
> and na.action() with rows of data from a frame.
> I have a data frame that contains information about when my academic
> department offered courses and their enrollments. The data frame
> looks something like
>   sem     year  C1e  C1s  C2e  C2s
>   Fall    1991   10    2   NA   NA
>   Spring  1992    3    1    8    1
>   Summer  1992   NA   NA  100   10
> where C?e represents a specific course's enrollment that semester and
> C?s represents the number of sections of that course offered. The
> frame is filled with integers and NAs. The data frame is of medium
> size, with about 180 columns and 45 rows.
> I need to cull some basic information from this dataset such as:
> (1) total number of sections offered each semester (and each year),
> (2) total number of credit hours generated each semester (and each
> year), and (3) the student-to-faculty ratio of the department each
> semester (and each year).
> From a mathematical standpoint, how to do each of these is obvious
> to me. But having to negotiate working within data frames and with
> matrices that have NA entries has really gotten me confused
> and frustrated. (I have no programming background.)
> To calculate (1) above for semester (rows), I know how to select the
> "sections" columns using grep(). What I'd like to do is sum the
> selected frame's non-NA entries row-by-row. For some reason, I was
> able to do this earlier today using the rowsum() function with
> na.rm=TRUE, but now it's not working. It complains of non-numeric
> entries. (In fact, I was able to use the rowsum() function to
> calculate (1) for each year.) When I try to convert the data frame
> (or a sub-frame) to a matrix, my integers turn into strings/characters,
> and I have no idea what to do with that!
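
A possible explanation, offered as a sketch rather than a diagnosis: rowsum() sums rows within groups, while rowSums() (capital S) sums across each row, and as.matrix() returns a character matrix as soon as one column (such as sem) is not numeric. The toy data below is copied from the example rows above; the column-name pattern "C<number>s" is an assumption based on that example.

```r
# Toy data copied from the example rows in the post.
df <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991, 1992, 1992),
  C1e  = c(10, 3, NA), C1s = c(2, 1, NA),
  C2e  = c(NA, 8, 100), C2s = c(NA, 1, 10)
)

# Select the "sections" columns (assumed to be named C<number>s).
sec.cols <- grep("^C[0-9]+s$", names(df))

# rowSums() adds across each row; na.rm = TRUE skips the NAs.
sections.per.sem <- rowSums(df[, sec.cols], na.rm = TRUE)

# Including the non-numeric column 'sem' makes the whole matrix character;
# restricting to the numeric columns keeps it numeric.
whole.is.char  <- is.character(as.matrix(df))
subset.is.num  <- is.numeric(as.matrix(df[, sec.cols]))

# rowsum() (no capital S) sums rows within groups, here by year.
sections.per.year <- rowsum(sections.per.sem, group = df$year)
```

Selecting only the numeric columns before converting or summing avoids the "non-numeric" complaint in this toy case.
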
> To calculate (2) above for a semester, I know how to select the
> enrollment columns using grep(). What I'd like to do is calculate
> the total credits generated by taking the dot product of each row
> with a vector whose components are the credit hour values of each
> course in my data frame. Of course, I'd have to account for the NA
> values in my data frame, but in the past I've had decent luck with
> using na.omit() and na.action() to select the non-NA components of a
> vector. Unfortunately, na.omit is absolutely not working with my
> dataframe; it just returns the names of all the columns!
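
On a data frame, na.omit() drops every row that contains any NA, so with many sparse columns it can easily leave zero rows, which would print as little more than the column names. A minimal sketch of the dot-product idea follows. It assumes that an NA enrollment means the course was not offered (so it can count as zero), that enrollment columns are named "C<number>e", and it uses made-up credit-hour values that are not from the post.

```r
# Toy data copied from the example rows in the post.
df <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991, 1992, 1992),
  C1e  = c(10, 3, NA), C1s = c(2, 1, NA),
  C2e  = c(NA, 8, 100), C2s = c(NA, 1, 10)
)

# na.omit() keeps only complete rows; here just the Spring 1992 row.
complete.rows <- nrow(na.omit(df))

# Select the enrollment columns (assumed to be named C<number>e).
enr.cols <- grep("^C[0-9]+e$", names(df))

# Hypothetical credit-hour values, one per course (C1, C2).
credits <- c(3, 4)

# Treat NA enrollment as zero students, then take the matrix product:
# each row of enrollments times the credit-hour vector.
enr <- as.matrix(df[, enr.cols])
enr[is.na(enr)] <- 0
credit.hours <- drop(enr %*% credits)
```

Replacing NA with 0 only on the extracted matrix leaves the original data frame untouched, so the NAs remain available for other calculations.
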
> Until I get (1) and (2) figured out, I have no hope of figuring out
> (3).
> Thank you for reading this far into this post. If you have any
> suggestions for how I can get na.omit() and summing to work for me,
> I'd appreciate hearing from you.
> Jason Miller
> ================================================================
> Jason E. Miller, Ph.D.
> Associate Professor of Mathematics
> Truman State University
> Kirksville, MO
> 660.785.7430
> ______________________________________________
> mailing list
> PLEASE do read the posting guide!

Petr Pikal

Received on Tue Dec 06 00:12:33 2005
