From: Thomas P. Colson <tpcolson_at_ncsu.edu>

Date: Sun 08 Oct 2006 - 21:21:44 GMT

coordsa <- expand.grid(y = seq(yULcorner, yLRcorner, by = -20),x = seq(xULcorner, xLRcorner, by = 20))

area<- data.frame(coordsa, tmin = as.vector(c(area,recursive = T))) names(area)<-c("x","y","z")

Plot(ecdf(area$z))

Date: Sun 08 Oct 2006 - 21:21:44 GMT

About 150,000 data points in these sets, ecdf curve plots in about 15 seconds.

xURcorner <- xLLcorner + (cellsize * (ncol(area) - 1)) xLRcorner <- xURcorner xULcorner <- xLLcorner yULcorner <- yLLcorner + (cellsize * (nrow(area) - 1)) yURcorner <- yULcorner yLRcorner <- yLLcorner

coordsa <- expand.grid(y = seq(yULcorner, yLRcorner, by = -20),x = seq(xULcorner, xLRcorner, by = 20))

area<- data.frame(coordsa, tmin = as.vector(c(area,recursive = T))) names(area)<-c("x","y","z")

Plot(ecdf(area$z))

-----Original Message-----

From: Roger Bivand [mailto:Roger.Bivand@nhh.no]
Sent: Sunday, October 08, 2006 4:37 PM

To: Thomas P. Colson

Cc: r-help@stat.math.ethz.ch

Subject: Re: [R] Probability of exceedance function question

On Sun, 8 Oct 2006, Thomas P. Colson wrote:

> I'm trying to calculate a cumulative area distribution (graph) of

*> drainage areas. This is defined as P(A > A*). Simple in principle. I
**> can do this in excel, with "COUNTIF", which will count the number of
**> cells in the row "area" that have area A, then determine, for each
**> cell in the row "area, how many cells exceede that area, then dividing
**> that number by the total number of cells, which gives me the
**> probability that drainage area A exceeds drainage area A*.
*

Is this ecdf() of the vector or its suitable subset? If so, it runs very fast even for large data sets. For plotting, bear in mind that you are generating a lot of output, though:

> t0 <- runif(100000)

*> system.time(t1 <- ecdf(t0))
*

[1] 0.222 0.022 0.248 0.000 0.000

> system.time(plot(t1, pch="."))

[1] 1.089 0.079 1.186 0.000 0.000

isn't at all bad!

*>
*

> E.g, drainage area of 6 sq meters (One DEM grid cell) has a high

*> probability of exceedance(.99), while a drainage area of 100,000
**> square meters has a low probability of exceedance (.001).
**>
**> I wish to plot this relationship, and we all know that excel is not
**> the tool of choice when working with hundreds of thousands of records.
**> I'd like to port the CAD into a few R functions that I've already
**> developed for other tests as well.
**>
**> So my challenge, in R, is to
**> (1)count the number of rows in column "Area" that have AREA(*),
**>
**> (2) determine, by row, how many rows have an area greater than the
**> area given in that one row
**>
**> (3) divide step 2 by number of rows (how can I do a row count and port
**> that to a variable, as I have to do this on 10 datasets?)
**>
**> Thanks for any advice you can offer to this endevour
**>
**> ______________________________________________
**> R-help@stat.math.ethz.ch mailing list
**> https://stat.ethz.ch/mailman/listinfo/r-help
**> PLEASE do read the posting guide
**> http://www.R-project.org/posting-guide.html
**> and provide commented, minimal, self-contained, reproducible code.
**>
*

-- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand@nhh.no ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.Received on Mon Oct 09 07:23:23 2006

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Mon 09 Oct 2006 - 09:30:09 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*