[Rd] ecdf with lots of ties is inefficient (PR#7292)

From: <martin_at_gsc.riken.jp>
Date: Sun 17 Oct 2004 - 16:50:23 EST


Full_Name: Martin Frith
Version: R-2.0.0
OS: linux-gnu
Submission from: (NULL) (134.160.83.73)

I have large vectors containing 100,000 to 20,000,000 numbers, but only a few hundred *distinct* values (e.g. positive integers < 200). When I call ecdf(v), it either runs out of memory or succeeds, but plotting the result to a postscript device then produces unnecessarily bloated output, because the same line segments get redrawn many times. The cost of ecdf should depend on the number of distinct values, not on the total number of values.
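
To illustrate the idea, here is a minimal sketch of an ecdf built from the distinct values and their counts. The name ecdf_distinct is hypothetical (it is not part of R); the sketch assumes that stepfun() from the stats package is acceptable as the return type and that NAs should simply be dropped. It still needs one integer index vector as long as the input, so it only reduces the size of the resulting object and of the plot, not the peak memory of the initial pass.

```r
# Hypothetical helper, not part of R: an ECDF with one knot per distinct value.
ecdf_distinct <- function(x) {
  x <- x[!is.na(x)]                      # assumption: silently drop NAs
  n <- length(x)
  if (n < 1) stop("'x' must have 1 or more non-missing values")
  vals <- sort(unique(x))                # k distinct values, k << n
  # Count how often each distinct value occurs.
  counts <- tabulate(match(x, vals), nbins = length(vals))
  # Right-continuous step function: 0 left of the first knot,
  # cumulative proportion at and after each knot.
  stepfun(vals, c(0, cumsum(counts) / n), right = FALSE)
}

v <- sample(1:199, 1e6, replace = TRUE)  # many values, few distinct ones
Fn <- ecdf_distinct(v)
length(knots(Fn))                        # at most 199, not 1e6
```

Plotting Fn then draws one step per distinct value rather than one per observation.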

This is my first bug report, so forgive me if I've done something stupid!



R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Sun Oct 17 16:56:29 2004