[R] R: R: gstat problem with lidar data

From: Alessandro <alessandro.montaghi_at_unifi.it>
Date: Wed, 16 Jul 2008 16:46:51 -0700

Ciao Dylan,

Thanks for your help. When I reach the step "V <- variogram(z~1, d.small)", this error appears:

 Error in gstat(formula = object, locations = locations, data = data) :   argument "data" is missing, with no default (original Italian message: l'argomento "data" non è specificato e non ha un valore predefinito)

Here is my code. I hope to improve it, because I believe R is a good solution for this new kind of data (lidar). For ecological, hydrological and other applications, it is important to study several processing approaches and to test a range of software and procedures.

Thank you again; your help is very important to me.

Ale

*****************************************R**************************************************

> testground <- read.table(file="c:/work_LIDAR_USA/R_kriging/ground26841492694149.txt", header=T, sep=" ")
> library (sp)
> class (testground)
[1] "data.frame"
> coordinates (testground)=~X+Y
> library (gstat)
> class (testground)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
> x <- 1:100000
> sample(x, 100)
  [1] 38465 18997 98968 56905 31535 5297 91034 57374 56148 4407 16033 74842
 [13] 49516 91422 31812 94924 44332 30412 21990 61698 53816 51227 24848 26824
 [25] 95203 20714 28172 60565 61309 24883 14063 19545 45505 24654 99649 92476
 [37] 84208 73181 13319 1559 67268 13935 57486 4162 49480 68167 38897 33295
 [49] 83067 47544 73390 9646 73967 81101 97055 96514 28011 99185 95511 98106
 [61] 86564 9635 58078 72627 2634 77933 80923 19056 13540 30066 66614 35185
 [73] 28856 61629 90387 30456 78108 18232 64321 68473 9021 15150 74326 17764
 [85] 98459 38203 62364 86437 65911 14058 27638 86792 82157 13721 15988 62189
 [97] 47190 912 33741 95151
> d <- data.frame(x=rnorm(100), y=rnorm(100), z=rnorm(100))
> rand_rows <- sample(1:nrow(d), 10)
> d.small <- d[rand_rows, ]
> summary (d.small)

       x                 y                 z          
 Min.   :-1.9838   Min.   :-1.7096   Min.   :-1.8724  
 1st Qu.:-0.5412   1st Qu.:-0.3629   1st Qu.:-1.3087  
 Median : 0.1373   Median : 0.3014   Median :-0.6858  
 Mean   :-0.1825   Mean   : 0.0811   Mean   :-0.5395  
 3rd Qu.: 0.5796   3rd Qu.: 0.8645   3rd Qu.: 0.1156  
 Max.   : 1.1075   Max.   : 0.9342   Max.   : 1.4642  
>
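
A possible explanation for the error reported above, offered as a sketch rather than a confirmed diagnosis: in this transcript d.small is still a plain data.frame (only testground was given coordinates), so variogram() takes it as the "locations" argument and then finds no "data". Either passing the coordinates as a formula or promoting the subset to a spatial object first should avoid the error. The seed, the subset size of 50, and the cutoff and width values are illustrative assumptions, not values from this thread.

```r
library(sp)
library(gstat)

set.seed(1)  # illustrative seed, only for reproducibility

# fake data, as in the transcript above
d <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))

# random subset of rows (50 is an arbitrary illustrative size)
d.small <- d[sample(nrow(d), 50), ]

# option 1: give the coordinates as a formula and name the data argument
V1 <- variogram(z ~ 1, locations = ~x + y, data = d.small)

# option 2: promote the subset to a SpatialPointsDataFrame first,
# the same step that was applied to testground
coordinates(d.small) <- ~x + y
V2 <- variogram(z ~ 1, d.small)

# cutoff limits the largest separation distance considered and
# width sets the lag bin size (2 and 0.25 are illustrative values)
V3 <- variogram(z ~ 1, d.small, cutoff = 2, width = 0.25)
```

For the real data the same idea would apply to a random subset of the rows of testground rather than to the fake data frame d.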


-----Original Message-----
From: Dylan Beaudette [mailto:dylan.beaudette_at_gmail.com]
Sent: Wednesday, 16 July 2008 14:23
To: Alessandro
Cc: r-help_at_r-project.org
Subject: Re: R: [R] gstat problem with lidar data

On Wednesday 16 July 2008, Alessandro wrote:
> Hey Dylan,
>
> Thank you. For my PhD I wish to test TIN (done, with ArcMap), IDW (done, with
> ArcMap) and a kriging model (and others, if possible) to create a DSM, a
> DEM, and a DCM (DSM-DEM). I tried gstat in IDRISI, but my PC runs out of
> memory.
> I would like to use gstat in R to build the surface maps (in grid format for
> IDRISI or ArcMap). Unfortunately I have the same problem in R (out of
> memory), because the dataset is big. Therefore I wish to create a random
> subsample from the more than 5,000,000 points.
> Here is my code (sorry, I am brand new to R).
>
> Data type (in *.txt format)
>
> X Y Z
> ....... ....... ........
> ....... ....... ........
>
> testground <- read.table(file="c:/work_LIDAR_USA/R_kriging/ground26841492694149.txt",
>                          header=T, sep=" ")
> summary (testground)
> plot(testground[,1],testground[,2])
> library (sp)
> class (testground)
> coordinates (testground)=~X+Y
> library (gstat)
> class (testground)
> V <- variogram(z~1, testground)
>
> When I reach this step, an "out of memory" error appears.
>
> I would be very grateful for your help, because my work is stalled.
>
> Ale
>

Hi Ale. Please remember to CC the list next time.

Since R is memory-bound (for the most part), you should be summarizing your data first, then loading into R.

If you can install GRASS, I would highly recommend using the r.in.xyz command
to pre-grid your data to a reasonable cell size, such that the resulting raster will fit into memory.
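
For readers who cannot install GRASS, the pre-gridding idea can be roughly sketched in base R by reading the file in chunks and keeping only per-cell sums and counts, so the full point cloud never has to sit in memory. This is only a simplified illustration of binning points into cells, not a reimplementation of r.in.xyz. The helper name grid_mean_z, its arguments, the cell size and the chunk size are all hypothetical, and the file is assumed to be space-separated with one header line and columns in X, Y, Z order.

```r
# Hypothetical helper: mean Z per grid cell, reading the file chunk by chunk.
# Row 1 of the result is the southernmost row (ymin), column 1 the westernmost.
grid_mean_z <- function(path, xmin, xmax, ymin, ymax, cell,
                        chunk_rows = 100000) {
  ncol_g <- ceiling((xmax - xmin) / cell)
  nrow_g <- ceiling((ymax - ymin) / cell)
  zsum <- matrix(0, nrow_g, ncol_g)  # running sum of Z per cell
  n    <- matrix(0, nrow_g, ncol_g)  # running point count per cell

  con <- file(path, open = "r")
  on.exit(close(con))
  readLines(con, n = 1)  # skip the header line

  repeat {
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0) break
    chunk <- read.table(text = lines, sep = " ",
                        col.names = c("X", "Y", "Z"))

    # keep only points inside the requested extent
    inside <- chunk$X >= xmin & chunk$X <= xmax &
              chunk$Y >= ymin & chunk$Y <= ymax
    if (!any(inside)) next
    p <- chunk[inside, ]

    # cell indices; points exactly on the upper edge go into the last cell
    col <- pmin(floor((p$X - xmin) / cell) + 1, ncol_g)
    row <- pmin(floor((p$Y - ymin) / cell) + 1, nrow_g)
    lin <- as.integer((col - 1) * nrow_g + row)  # column-major linear index

    # add this chunk's per-cell sums and counts to the running totals
    s <- rowsum(p$Z, lin)
    add <- numeric(nrow_g * ncol_g)
    add[as.integer(rownames(s))] <- s[, 1]
    zsum <- zsum + add
    n <- n + tabulate(lin, nbins = nrow_g * ncol_g)
  }

  out <- zsum / n
  out[n == 0] <- NA  # cells that received no points
  out
}

# small made-up example: four points on a 2 x 2 grid of 1-unit cells
tmp <- tempfile(fileext = ".txt")
pts <- data.frame(X = c(0.5, 0.6, 1.5, 1.5),
                  Y = c(0.5, 0.4, 1.5, 0.5),
                  Z = c(10, 20, 5, 7))
write.table(pts, tmp, sep = " ", row.names = FALSE, quote = FALSE)
g <- grid_mean_z(tmp, xmin = 0, xmax = 2, ymin = 0, ymax = 2,
                 cell = 1, chunk_rows = 2)
```

Only the two small matrices grow with the grid size, so memory use depends on the chosen cell size rather than on the number of points.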

If you cannot, and can somehow manage to get the raw data into R, sampling random rows would work.

# make some data:
x <- 1:100000

# just some of the data
sample(x, 100)

# use this idea to extract x,y,z triplets
# from some fake data:
d <- data.frame(x=rnorm(100), y=rnorm(100), z=rnorm(100))

# select 10 random rows:
rand_rows <- sample(1:nrow(d), 10)

# just the selected rows:
d.small <- d[rand_rows, ]

Keep in mind you will need enough memory to contain the original data AND your
subset data. Once you have the subset, remove the original data with rm().
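
The subset-then-discard step can be sketched as follows. The data frame "big" is made-up stand-in data, and the sizes and the seed are illustrative assumptions, not values from the lidar file.

```r
set.seed(1)  # illustrative seed, only for reproducibility

# stand-in for the full point cloud (made-up data)
big <- data.frame(X = runif(1e5), Y = runif(1e5), Z = rnorm(1e5))

# choose 5000 distinct rows at random (sample() draws without replacement)
n_keep <- 5000
keep <- sample(nrow(big), n_keep)
small <- big[keep, ]

# drop the original and ask R to release the memory it no longer needs
rm(big)
invisible(gc())
```

After rm() only the subset remains in the workspace; gc() is optional, since R collects garbage on its own, but calling it makes the release happen immediately.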

As for the statistical implications of randomly sampling a point cloud for variogram analysis-- someone smarter than I may be helpful.

Cheers,

Dylan

>
>
> -----Original Message-----
> From: Dylan Beaudette [mailto:dylan.beaudette_at_gmail.com]
> Sent: Wednesday, 16 July 2008 12:45
> To: r-help_at_r-project.org
> Cc: Alessandro
> Subject: Re: [R] gstat problem with lidar data
>
> On Wednesday 16 July 2008, Alessandro wrote:
> > Hey,
> >
> > I am a PhD student in forestry science, and I am brand new to R. I am
> > working with lidar data (point clouds with X, Y and Z values). I wish to
> > create a spatial map by kriging from the point cloud. My problem is the
> > big data set (over 5,000,000 points): I always run out of memory.
> >
> > Is there a script to create a subset, or to modify the radius of the variogram?
>
> Do you have any reason to prefer kriging over some other, less intensive
> method such as RST (regularized splines with tension)?
>
> Check out GRASS or GMT for ideas on how to grid such a massive point set.
> Specifically the r.in.xyz and v.surf.rst modules from GRASS.
>
> Cheers,

-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 16 Jul 2008 - 23:50:09 GMT
