Re: [Rd] ecdf with lots of ties is inefficient (PR#7292)

From: <p.dalgaard_at_biostat.ku.dk>
Date: Sun 17 Oct 2004 - 19:27:24 EST


Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:

> vals <- sort(unique(x))
> y <- tabulate(match(x, vals))
> rval <- approxfun(vals, cumsum(y)/n, method = "constant", yleft = 0,
> yright = 1, f = 0, ties = "ordered")
>
> should work better for you and may be a little slower if there are no ties,
> but will use more memory.

...and if all you need is the plot, continue Brian's code with

  Fv <- c(0,cumsum(y))/sum(y)
  xx <- c(vals[1],vals)
  plot(xx, Fv, type="s")

which might well be close enough for your purposes. Or, of course,

  Fs <- stepfun(vals,c(0,cumsum(y)/sum(y)))
  plot(Fs,verticals=FALSE)
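
As a quick sanity check, here is a minimal self-contained sketch that the
tabulated step function agrees with ecdf() on heavily tied data. The sample
data and the names n and probe are made up for illustration and are not part
of the original exchange.

```r
# Hypothetical data with many ties: 10,000 draws from only 20 distinct values
set.seed(1)
x <- sample(1:20, 1e4, replace = TRUE)
n <- length(x)

# Tabulation step from the quoted code: one count per distinct value
vals <- sort(unique(x))
y <- tabulate(match(x, vals))

# Step function built from the distinct values only
Fs <- stepfun(vals, c(0, cumsum(y) / sum(y)))

# Compare with stats::ecdf at the jump points, between them, and left of the data
probe <- c(vals, vals + 0.5, min(vals) - 1)
agree <- all.equal(Fs(probe), ecdf(x)(probe))
print(agree)
```

The point of the construction is that Fs only stores length(vals) knots
rather than one per observation, which is what should help when ties
dominate.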

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sun Oct 17 19:35:05 2004