Re: [R] Help : delete at random

Date: Wed 02 Mar 2005 - 02:54:21 EST

This might be slightly more interesting. If we want the values to be missing completely at random, we can simply sample from all available (row, column) index pairs of a 2-d array.

# simulate data #
set.seed(1) # for reproducibility
m <- matrix( rnorm(12), nrow=4, ncol=3 )
m

```
           [,1]       [,2]       [,3]
[1,] -0.6264538  0.3295078  0.5757814
[2,]  0.1836433 -0.8204684 -0.3053884
[3,] -0.8356286  0.4874291  1.5117812
[4,]  1.5952808  0.7383247  0.3898432

# generate all possible (row, col) index pairs
indices  <- expand.grid( row=1:nrow(m), col=1:ncol(m) )
N        <- ncol(m)*nrow(m)   # number of total elements

```

Now suppose you want 25% of the values to be missing. Then:

k <- round( 0.25 * N )
w <- as.matrix( indices[ sample( 1:N, k ), ] )
w # shows the row and column numbers that will be set to NA
  row col
4   4   1
5   1   2
1   1   1

m[ w ] <- NA # set the sampled cells to NA
m

```
           [,1]       [,2]       [,3]
[1,]         NA         NA  0.5757814
[2,]  0.1836433 -0.8204684 -0.3053884
[3,] -0.8356286  0.4874291  1.5117812
[4,]         NA  0.7383247  0.3898432
```
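
Since the original question was about a data frame, the same idea can be wrapped in a small function. This is only a sketch and not part of the steps above: `delete_at_random` and its argument `prop` are hypothetical names, and it uses a logical mask rather than the `expand.grid()` index table, because indexing with a logical matrix of the same dimensions works for both matrices and data frames.

```r
# Hypothetical helper (an assumption, not from the post): set a proportion
# `prop` of the cells of a matrix or data frame to NA, chosen at random.
delete_at_random <- function(x, prop = 0.25) {
  stopifnot(prop >= 0, prop <= 1)
  N <- nrow(x) * ncol(x)            # total number of cells
  k <- round(prop * N)              # number of cells to set to NA
  mask <- matrix(FALSE, nrow = nrow(x), ncol = ncol(x))
  mask[sample.int(N, k)] <- TRUE    # k distinct cells, drawn without replacement
  x[mask] <- NA                     # logical-matrix indexing
  x
}

# Illustrative data only
set.seed(1)
df <- data.frame(a = rnorm(4), b = rnorm(4), c = rnorm(4))
df_na <- delete_at_random(df, prop = 0.25)
sum(is.na(df_na))   # 3 of the 12 cells
```

Because the cells are sampled without replacement, exactly `k` of them end up as NA, provided the input had no missing values to begin with.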

On Tue, 2005-03-01 at 15:30 +0100, Uwe Ligges wrote:
> Caroline TRUNTZER wrote:
> > Hello
> > I would like to delete some values at random in a data frame. Does
> > anyone know how I could do?
>
> What about sample()-ing (if I understand "at random" correctly) a
> certain number of values from 1:nrow(data) and using the result as
> negative index the data.frame?
>
> Uwe Ligges
>
>
> > With best regards
> > Caroline
> >
>
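
If "at random" means dropping whole rows, as in the quoted suggestion, a minimal sketch could look like the following. The data frame `dat` and the choice of two rows are assumptions for illustration only.

```r
# Illustrative data; `dat` and `drop_rows` are hypothetical names.
set.seed(1)
dat <- data.frame(id = 1:10, x = rnorm(10))

drop_rows <- sample(nrow(dat), 2)   # row numbers to delete, without replacement
dat2 <- dat[-drop_rows, ]           # a negative index removes those rows
nrow(dat2)                          # 8
```

One caveat: if the number of rows to drop can be zero, `dat[-integer(0), ]` returns no rows at all, so that case needs a guard.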
