Re: [Rd] Any interest in "merge" and "by" implementations specifically for sorted data?

From: Kevin B. Hendricks <kevin.hendricks_at_sympatico.ca>
Date: Sun 30 Jul 2006 - 14:11:21 GMT

Hi Bill,

After playing with this some more and adding an implementation to handle NAs in the data vector, I have run into the problem of what to return when the only data values for a particular bin (or level) in the data vector are NAs and the user selected na.rm=T:

  1. Should it return 0 for the count of that particular bin and NA for that bin from all of the other functions? If so, wouldn't it be strange to return an NA just because there is no valid data for that bin, given that the user asked for na.rm=T?
  2. Or do I have to literally rebuild the final result vector, removing all "unused" bins before returning the results? And wouldn't that cause problems, in that not all of the levels from 1:ngroups would be returned for some variables while they would be for others?

I personally like approach 1 better. If I give an igroup function my groups and tell it to na.rm=T on my data vector, I would really want all group levels returned and not just the ones that had valid data in them. If a particular group had no data, I would want the count to be 0 for that bin and all of the other functions to return NA for that particular bin.

Is that what you are returning in that case?
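
To make approach 1 concrete, here is a minimal sketch in plain R. The function name igroup_stats and its arguments are hypothetical, made up for this illustration; it is not the igroup implementation being discussed. It assumes the group codes are integers in 1:ngroups with no NAs, and it keeps every bin, giving a count of 0 and NA for the other summaries when a bin has no valid data.

```r
# Hypothetical helper (not part of any package): per-group counts, sums,
# maxs and mins that keep every level in 1:ngroups, following approach 1.
igroup_stats <- function(x, group, ngroups, na.rm = TRUE) {
  counts <- integer(ngroups)
  sums <- maxs <- mins <- rep(NA_real_, ngroups)
  for (g in seq_len(ngroups)) {
    v <- x[group == g]
    if (na.rm) v <- v[!is.na(v)]   # drop NAs from this bin's data
    counts[g] <- length(v)
    if (length(v) > 0L) {          # bins with no valid data stay NA
      sums[g] <- sum(v)
      maxs[g] <- max(v)
      mins[g] <- min(v)
    }
  }
  data.frame(group = seq_len(ngroups), count = counts,
             sum = sums, max = maxs, min = mins)
}

# Bin 2 contains only NAs, so it should come back as count 0 and NA elsewhere.
x     <- c(1.5, NA, 2.5, NA, NA, 4)
group <- c(1L,  1L, 1L,  2L, 2L, 3L)
res <- igroup_stats(x, group, ngroups = 3L)
print(res)
```

The explicit check on empty bins matters because base R's sum(numeric(0)) returns 0 and max(numeric(0)) returns -Inf with a warning, neither of which is the NA that approach 1 asks for.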

Also, do you always return Sums, Maxs, and Mins as "numeric" or do you sometimes return "integer" values if an "integer" data vector is passed in?

Are "Counts" always returned as "integer" or do you always set them to "numeric" or does that vary with the type of the data vector passed in?

Do you handle "complex" data vectors in a similar fashion (i.e., using the length of the complex vector as its value for Maxs, Mins, etc.)?
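
For comparison, the short sketch below shows how base R's own sum, max, and length behave on the type questions above. It says nothing about how the igroup functions behave. Reading the "length" of a complex value as its modulus, computed with Mod(), is an assumption made for this example.

```r
xi <- c(3L, 1L, 2L)          # integer data
xd <- c(3, 1, 2)             # double ("numeric") data
xc <- c(3+4i, 1+0i)          # complex data

# Storage types of the results in base R
types <- c(sum_int = typeof(sum(xi)),
           max_int = typeof(max(xi)),
           sum_dbl = typeof(sum(xd)),
           count   = typeof(length(xi)))
print(types)

# max() and min() are not defined for complex vectors in base R, so one
# option is to compare moduli instead (assumption: "length" means modulus).
mod_max <- max(Mod(xc))
print(mod_max)
```

In base R, integer input stays integer for sum and max, and length returns an integer count for ordinary vectors, so a grouped version could reasonably follow the same convention.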

Thanks,

Kevin



R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon Jul 31 00:23:40 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.