From: Jason Miller <millerj_at_truman.edu>

Date: Mon 05 Dec 2005 - 11:55:06 EST

Jason E. Miller, Ph.D.

Associate Professor of Mathematics

Truman State University

Kirksville, MO

http://pyrite.truman.edu/~millerj/

660.785.7430

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Mon Dec 05 12:37:11 2005


Dear R-helpers,

New to R, I'm in the middle of a project that I'm using to force myself to learn R. I'm running into some behavior that I don't understand, and I need some advice. In the last week I've gotten some great advice from the list on visualizing my data, and I was hoping people could help me get over another barrier to my progress.

Before I describe what I'm trying to do and where I'm stuck with R, let me quickly outline what I need help with:

(1) summing over the non-NA entries in each row of a data frame, and

(2) using na.omit() and na.action() with rows of data from a frame.

I have a data frame that contains information about when my academic department offered courses and their enrollments. The data frame looks something like

sem     year  C1e  C1s  C2e  C2s
Fall    1991   10    2   NA   NA
Spring  1992    3    1    8    1
Summer  1992   NA   NA  100   10

where C?e represents a specific course's enrollment that semester and C?s represents the number of sections of that course offered. The frame is filled with integers and NAs. The data frame is of medium size, with about 180 columns and 45 rows.
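
For concreteness, here is a minimal sketch of a toy frame with this layout. It only copies the three example rows above; the object name `courses` is an assumption made for illustration, and the real frame has many more columns.

```r
# Toy version of the frame described above. The name 'courses' is
# hypothetical; the values are the three example rows from the post.
courses <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991L, 1992L, 1992L),
  C1e  = c(10L, 3L, NA),   # enrollment in course 1
  C1s  = c(2L, 1L, NA),    # sections of course 1
  C2e  = c(NA, 8L, 100L),  # enrollment in course 2
  C2s  = c(NA, 1L, 10L),   # sections of course 2
  stringsAsFactors = FALSE
)

# Inspect the column types: 'sem' is character, the rest are integer.
str(courses)

# Count the missing entries in the frame.
n_missing <- sum(is.na(courses))
```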

I need to cull some basic information from this dataset such as:

(1) total number of sections offered each semester (and each year),

(2) total number of credit hours generated each semester (and each year), and

(3) the student-to-faculty ratio of the department each semester (and each year).

From a mathematical standpoint, how to do each of these is obvious to me. But having to negotiate working within data frames and with matrices that have NA entries has really gotten me confused and frustrated. (I have no programming background.)

To calculate (1) above for semester (rows), I know how to select the "sections" columns using grep(). What I'd like to do is sum the selected frame's non-NA entries row-by-row. For some reason, I was able to do this earlier today using the rowsum() function with na.rm=TRUE, but now it's not working. It complains of non-numeric entries. (In fact, I was able to use the rowsum() function to calculate (1) for each year.) When I try to convert the data frame (or a sub-frame) to a matrix, my integers turn into strings/characters, and I have no idea what to do with that!
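
A minimal sketch of one way the row-by-row sum might be done, under stated assumptions: the toy frame `courses` and the column-name pattern `C<number>s` are illustrative, not the real data. It also shows why converting the whole frame gives strings: a matrix holds a single type, so one character column (here `sem`) turns every entry into a character.

```r
# Toy frame (hypothetical name 'courses'; values from the example rows).
courses <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991L, 1992L, 1992L),
  C1e  = c(10L, 3L, NA), C1s = c(2L, 1L, NA),
  C2e  = c(NA, 8L, 100L), C2s = c(NA, 1L, 10L),
  stringsAsFactors = FALSE
)

# Converting the whole frame drags the character column 'sem' along,
# so the resulting matrix is a character matrix.
whole_mode <- mode(as.matrix(courses))    # "character"

# Select only the section columns (assumed naming pattern: C<number>s).
sec_cols <- grep("^C[0-9]+s$", names(courses))
sec_mode <- mode(as.matrix(courses[, sec_cols, drop = FALSE]))  # "numeric"

# Row-by-row totals over the section columns, skipping NA entries.
sections_per_sem <- rowSums(courses[, sec_cols, drop = FALSE], na.rm = TRUE)

# Yearly totals: add up the per-semester totals within each year.
sections_per_year <- tapply(sections_per_sem, courses$year, sum)
```

With the toy data, the per-semester totals are 2, 2 and 10, and the yearly totals are 2 for 1991 and 12 for 1992. Note that `rowSums()` (row-wise sums) and `rowsum()` (sums within groups of rows) are different functions.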

To calculate (2) above for a semester, I know how to select the enrollment columns using grep(). What I'd like to do is calculate the total credits generated by taking the dot product of each row with a vector whose components are the credit hour values of each course in my data frame. Of course, I'd have to account for the NA values in my data frame, but in the past I've had decent luck with using na.omit() and na.action() to select the non-NA components of a vector. Unfortunately, na.omit() is absolutely not working with my data frame; it just returns the names of all the columns!
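
A minimal sketch of the dot-product idea, again under assumptions: the frame `courses`, the pattern `C<number>e`, and the credit-hour values in `credits` are all made up for illustration. Treating NA enrollment as zero lets an ordinary matrix product do the work. The sketch also shows that `na.omit()` on a data frame drops every row containing any NA, which would leave very few rows (possibly none) in a frame where most rows have at least one NA.

```r
# Toy frame (hypothetical name 'courses'; values from the example rows).
courses <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991L, 1992L, 1992L),
  C1e  = c(10L, 3L, NA), C1s = c(2L, 1L, NA),
  C2e  = c(NA, 8L, 100L), C2s = c(NA, 1L, 10L),
  stringsAsFactors = FALSE
)

# na.omit() on a data frame removes whole rows that contain any NA.
# Only the Spring row is complete, so a single row survives.
rows_left <- nrow(na.omit(courses))

# Select the enrollment columns (assumed naming pattern: C<number>e).
enr_cols <- grep("^C[0-9]+e$", names(courses))
enr <- as.matrix(courses[, enr_cols, drop = FALSE])

# A course not offered contributes no credit hours, so count NA as 0.
enr[is.na(enr)] <- 0

# Hypothetical credit-hour values, one per enrollment column, in order.
credits <- c(C1e = 3, C2e = 4)

# Dot product of each row with the credit vector.
credit_hours <- as.vector(enr %*% credits)
# Fall: 10*3 = 30; Spring: 3*3 + 8*4 = 41; Summer: 100*4 = 400
```

Replacing NA with 0 is a design choice that fits here because a missing entry means the course was not offered; it would be wrong if NA meant "unknown".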

Until I get (1) and (2) figured out, I have no hope of figuring out (3).

Thank you for reading this far into this post. If you have any suggestions for how I can get na.omit() and summing to work for me, I'd appreciate hearing from you.

Jason Miller


This archive was generated by hypermail 2.1.8: Fri 03 Mar 2006 - 03:41:28 EST