Re: [R] Regression between adjacent columns - error with NAs

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Wed, 30 Jul 2008 21:47:34 -0400

That's good. Try this:

  1. put set.seed(1) at the top of the code to make it reproducible.
  2. replace body of loop with:
       sel_col <- SourceMat[, i]
       out <- try(coef(lm(tt ~ sel_col, na.action = NULL)))
       if (!inherits(out, "try-error")) ResultMat[, i] <- out
  3. email tends to wrap long lines, so avoid putting comments at the end of a line. Put them on separate lines by themselves.
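
Putting the three suggestions above together, the full script might look like the sketch below. The data setup is taken from the quoted code further down; the use of seq_len(ncol(SourceMat)) and NA_real_ is an assumption made here for tidiness and is not part of the original advice.

```r
# Make the random data reproducible.
set.seed(1)

SourceMat <- matrix(rnorm(100), nrow = 10, ncol = 10)
# Column 3 contains only NAs; this is the column that triggers the error.
SourceMat[, 3] <- NA
tt <- time(SourceMat)

# Row 1 holds intercepts, row 2 holds slopes.
ResultMat <- matrix(NA_real_, nrow = 2, ncol = ncol(SourceMat))

for (i in seq_len(ncol(SourceMat))) {
  sel_col <- SourceMat[, i]
  # try() returns an object of class "try-error" instead of stopping the loop.
  out <- try(coef(lm(tt ~ sel_col, na.action = NULL)))
  # Only store the coefficients when the fit succeeded.
  if (!inherits(out, "try-error")) ResultMat[, i] <- out
}

print(ResultMat)
```

Column 3 of ResultMat keeps its initial NA values, and every other column is filled in.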

You will still get the error messages but it won't stop at them and will run to completion.
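
If the printed error messages are unwanted, there are two options that go beyond the advice above: call try(..., silent = TRUE), or test for an all-NA column before fitting so that lm() is never called on it. The sketch below shows the second option. The helper name fit_col is hypothetical, and the default na.action is used here in place of na.action=NULL so that columns with only some NAs are still fitted.

```r
set.seed(1)

SourceMat <- matrix(rnorm(100), nrow = 10, ncol = 10)
SourceMat[, 3] <- NA
tt <- time(SourceMat)

# Hypothetical helper: return intercept and slope, or NAs for an all-NA column.
fit_col <- function(x, y) {
  # lm() fails when there are no non-missing cases, so skip such columns.
  if (all(is.na(x))) return(c(NA_real_, NA_real_))
  # The default na.action drops rows where x is NA before fitting.
  unname(coef(lm(y ~ x)))
}

# One column of coefficients per source column, giving a 2 x 10 matrix.
ResultMat <- sapply(seq_len(ncol(SourceMat)),
                    function(i) fit_col(SourceMat[, i], tt))

print(ResultMat)
```

This runs without any error output, at the cost of an explicit check that try() would otherwise make unnecessary.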

On Wed, Jul 30, 2008 at 5:54 PM, rcoder <mpdotbook_at_gmail.com> wrote:
>
> Hi Gabor,
>
> Thanks for your reply. I've written something that can be copied and pasted
> into your monitor to reproduce the error I am experiencing. Once the loop
> experiences a column full of NAs in SourceMat (column 3), it exits with
> errors, and ResultMat is only partially complete (up to column 2) with o/p
> intercept and slope results.
>
> When I include the 'na.action=NULL' statement, I get the following
> statement:
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
> NA/NaN/Inf in foreign function call (arg 1)
>
> When I leave this statement out, I get the following:
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
> 0 (non-NA) cases
>
> In either case, ResultMat is only filled up to column 2:
>            [,1]       [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,]  5.3611056  5.4099400   NA   NA   NA   NA   NA   NA   NA    NA
> [2,] -0.8028985 -0.4078084   NA   NA   NA   NA   NA   NA   NA    NA
>
>
> ##Code start
> SourceMat<-matrix(data=rnorm(100), ncol=10, nrow=10)
> SourceMat[,3]<-c(NA)
> tt<-time(SourceMat)
> rownum=2
> colnum=10
> ResultMat<-matrix(NA, ncol=colnum, nrow=rownum)
> #loop through each column in the source matrix:
> for (i in 1:10)
> {
> #selecting the correct column in the matrix in turn
> sel_col<-SourceMat[col(SourceMat)==i]
> ResultMat[,i]<-coef(lm(tt~sel_col, na.action=NULL))
> }
> ##Code end
>
> I would be grateful for any suggestions to avoid this problem.
>
> Thanks,
>
> rcoder
>
>
> rcoder wrote:
>>
>> Well, in this case I don't think my original code would have helped
>> much...
>>
>> So, I've rewritten as below. I want to perform regression between one
>> column in a matrix and all other columns in the same matrix. I have a for
>> loop to achieve this, which succeeds in exporting intercept and slope
>> coefficients to a results matrix, except when a column that contains only
>> NAs is reached. Columns partially filled with NAs are handled, but the
>> code exits with errors when a single column is filled with NAs. I inserted
>> the 'na.action=NULL' statement within the lm() construct, but to no avail.
>> I would be very grateful for any advice.
>>
>>>tt<-time(SourceMat)
>> #creates an o/p template matrix
>>>ResultMat<-matrix(NA, ncol=colnum, nrow=rownum)
>>
>> #loop through each column in the source matrix:
>>>for (i in 1:5000)
>> {
>> #selecting the correct column in the matrix in turn
>> sel_col<-SourceMat[col(SourceMat)==i]
>> ResultMat[,i]<-coef(lm(tt~sel_col, na.action=NULL))
>> }
>>
>> Thanks,
>>
>> rcoder
>>
>>
>> Gabor Grothendieck wrote:
>>>
>>> Read the last line of every message to r-help.
>>>
>>> On Tue, Jul 29, 2008 at 6:15 PM, rcoder <mpdotbook_at_gmail.com> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> I am trying to apply linear regression to adjacent columns in a matrix
>>>> (i.e. col1~col2; col3~col4; etc.). The columns in my matrix come with
>>>> identifiers at the top of each column, but when I try to use these
>>>> identifiers to reference the columns in the regression function using
>>>> rollapply(), the columns are not recognised and the regression breaks
>>>> down. Is there a more robust way to reference the columns I need, so that
>>>> I can apply the regression across the matrix; 'by.column', but every
>>>> other column?
>>>>
>>>> Thanks,
>>>>
>>>> rcoder
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/rolling-regression-between-adjacent-columns-tp18722392p18722392.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> ______________________________________________
>>>> R-help_at_r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/rolling-regression-between-adjacent-columns-tp18722392p18743632.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



Received on Thu 31 Jul 2008 - 01:51:42 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 31 Jul 2008 - 19:33:02 GMT.
