[Rd] sapply improvements

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Wed, 04 Nov 2009 15:24:12 -0500

On 11/4/2009 12:15 PM, William Dunlap wrote:
>> -----Original Message-----
>> From: r-devel-bounces_at_r-project.org
>> [mailto:r-devel-bounces_at_r-project.org] On Behalf Of Duncan Murdoch
>> Sent: Wednesday, November 04, 2009 8:47 AM
>> To: michael.m.spiegel_at_gmail.com
>> Cc: R-bugs_at_r-project.org; r-devel_at_stat.math.ethz.ch
>> Subject: Re: [Rd] error in install.packages() (PR#14042)
>>
>> On 11/4/2009 11:05 AM, michael.m.spiegel_at_gmail.com wrote:
>> > Full_Name: Michael Spiegel
>> > Version: 2.10
>> > OS: Windows Vista
>> > Submission from: (NULL) (76.104.24.156)
>> >
>> >
>> > The following error is produced when attempting to call
>> install.packages. Here
>> > is the results of the traceback:
>> >
>> >> source('http://openmx.psyc.virginia.edu/getOpenMx.R')
>> > Error in f(res) : invalid subscript type 'list'
>> >> traceback()
>> > 7: f(res)
>> > 6: available.packages(contriburl = contriburl, method = method)
>> > 5: .install.winbinary(pkgs = pkgs, lib = lib, contriburl =
>> contriburl,
>> > method = method, available = available, destdir = destdir,
>> > dependencies = dependencies, ...)
>> > 4: install.packages(pkgs = c("OpenMx"), repos = repos)
>> > 3: eval.with.vis(expr, envir, enclos)
>> > 2: eval.with.vis(ei, envir)
>> > 1: source("http://openmx.psyc.virginia.edu/getOpenMx.R")
>> >
>> > I've tracked the error down to somewhere in
>> available.packages defined in
>> > src\library\utils\R\packages.R. I am guessing that the
>> error in version 2.10
>> > has something to do with the change: "available.packages()
>> gains a 'filters'
>> > argument for specifying the filtering operations performed
>> on the packages found
>> > in the repositories."
>>
>> I've found the error, and will fix and commit to R-devel and
>> R-patched.
>>
>> For future reference: the problem was that it assigned the result of
>> sapply() to a subset of a vector. Normally sapply() simplifies its
>> result to a vector, but in this case the result was empty, so
>> sapply()
>> returned an empty list; assigning a list to a vector coerced
>> the vector
>> to a list, and then the "invalid subscript type 'list'" came
>> soon after.

> 
> I've run into this sort of problem a lot (0-long input to sapply
> causes it to return list()).  A related problem is that when sapply's
> FUN doesn't always return the type of value you expect for some
> corner case then sapply won't do the expected simplication.  If
> sapply had an argument that gave the expected form of FUN's output
> then sapply could (a) die if some call to FUN didn't return something
> of that form and (b) return a 0-long object of the correct form
> if sapply's X has length zero so FUN is never called.  E.g.,
>    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on
> third iteration
>    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return
> logical(0)
> 
> Another benefit of sapply knowing the type of FUN's return value is
> that it wouldn't have to waste space creating a list of FUN's return
> values but could stuff them directly into the final output structure.
> A list of n scalar doubles is 4.5 times bigger than double(n) and the
> factor is 9.0 for integers and logicals.


What do you think of the behaviour of the sapply function below? (I wouldn't put it into R as it is, I'd translate it to C code to avoid the lapply call; but I'd like to get the behaviour right before doing that.)

This one checks that the length() and typeof() results are consistent. If the FUN.VALUE has names, those are used (but it doesn't require the names from FUN to match).

Duncan Murdoch

sapply <- function(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE, FUN.VALUE)
{

     FUN <- match.fun(FUN)
     answer <- lapply(X, FUN, ...)
     if(USE.NAMES && is.character(X) && is.null(names(answer)))
                 names(answer) <- X
     if(simplify) {
     	if (missing(FUN.VALUE)) {
     	    if ((!length(answer))
     	        || length(common.len <- unique(unlist(lapply(answer, 
length)))) != 1L)
     	        return(answer)
     	    common.names <- names(answer[[1L]])
     	} else {
     	    common.len <- length(FUN.VALUE)
     	    common.type <- typeof(FUN.VALUE)
     	    common.names <- names(FUN.VALUE)
     	    if (length(answer)) {
     	    	if (any( unlist(lapply(answer, length)) != common.len ))
     	    	    stop(sprintf("%s values must be of length %d", "FUN", 
common.len))
     	    	if (any( unlist(lapply(answer, typeof)) != common.type ))
     	    	    stop(sprintf("%s values must be of type '%s'", "FUN", 
common.type))
     	    	if (is.null(common.names))
     	    	    common.names <- names(answer[[1L]])
     	    } else if (length(FUN.VALUE) > 1)
     	    	return(array(FUN.VALUE[0], dim=c(common.len, 0),
     	    	             dimnames= if(!is.null(common.names))
     	    	                       list(common.names,character(0))))
     	    else
     	    	return(FUN.VALUE[0])	
     	}
	if(common.len == 1L)
	    unlist(answer, recursive = FALSE)
	else if(common.len > 1L)
	    array(unlist(answer, recursive = FALSE),
		  dim= c(common.len, length(X)),
		  dimnames= if(!(is.null(common.names) &
			         is.null(n2 <- names(answer)))) list(common.names,n2))
	else answer
     } else answer

}

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Wed 04 Nov 2009 - 20:38:10 GMT

This archive was generated by hypermail 2.2.0 : Wed 04 Nov 2009 - 21:10:24 GMT