Re: [Rd] sapply improvements

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Wed, 04 Nov 2009 22:16:03 +0100

William Dunlap wrote:

> It looks good on the following examples:
> 

>> z <- split(log(1:10), rep(letters[1:2],c(3,7)))
>> sapply(z, length, FUN.VALUE=numeric(1))
> Error in sapply(z, length, FUN.VALUE = numeric(1)) : 
>   FUN values must be of type 'double'
> 
> (I'd like the error to say "... must be of type 'double',
> not 'integer'", to give the user a fuller diagnosis of
> the problem.)

Umm, not following too closely, but would it not be preferable just to coerce in such cases? I can see a lot of issues of the

if (x <= 0) NA else log(x)

variety otherwise.
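
A small sketch of the concern, not taken from the thread: base R's sapply() has no FUN.VALUE argument, so this uses vapply(), the base R function that does take one and that promotes along logical < integer < double rather than failing. The helper name safe_log is made up for illustration.

```r
# Hypothetical helper wrapping the expression quoted above
safe_log <- function(x) if (x <= 0) NA else log(x)

typeof(safe_log(-1))  # "logical": a bare NA is a logical value
typeof(safe_log(2))   # "double"

# With strict type matching the first call would be rejected;
# vapply() promotes the logical NA to double instead.
res <- vapply(c(-1, 1, 2), safe_log, FUN.VALUE = numeric(1))
typeof(res)           # "double"
```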

>> sapply(z, range, FUN.VALUE=c(Min=0,Max=0))

>            a        b
> Min 0.000000 1.386294
> Max 1.098612 2.302585
> 
> Exactly matching the typeof's and using the names
> for row.names on matrix output seem good to me.
>  
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com  
> 
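
The naming behaviour in the range example can be tried with vapply(), which is in base R and accepts a named FUN.VALUE. This is only a sketch using that function as a stand-in; the sapply() variant under discussion is not shown in this message.

```r
z <- split(log(1:10), rep(letters[1:2], c(3, 7)))

# range() returns an unnamed length-2 double; the row names of the
# result come from the names of FUN.VALUE, the column names from z.
m <- vapply(z, range, FUN.VALUE = c(Min = 0, Max = 0))
rownames(m)  # "Min" "Max"
colnames(m)  # "a" "b"
```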

>> -----Original Message-----
>> From: Duncan Murdoch [mailto:murdoch_at_stats.uwo.ca]
>> Sent: Wednesday, November 04, 2009 12:24 PM
>> To: William Dunlap
>> Cc: michael.m.spiegel_at_gmail.com; r-devel_at_stat.math.ethz.ch
>> Subject: sapply improvements
>>
>> On 11/4/2009 12:15 PM, William Dunlap wrote:
>>>> -----Original Message-----
>>>> From: r-devel-bounces_at_r-project.org
>>>> [mailto:r-devel-bounces_at_r-project.org] On Behalf Of Duncan Murdoch
>>>> Sent: Wednesday, November 04, 2009 8:47 AM
>>>> To: michael.m.spiegel_at_gmail.com
>>>> Cc: R-bugs_at_r-project.org; r-devel_at_stat.math.ethz.ch
>>>> Subject: Re: [Rd] error in install.packages() (PR#14042)
>>>>
> ... 

>>>> For future reference: the problem was that it assigned the result of
>>>> sapply() to a subset of a vector. Normally sapply() simplifies its
>>>> result to a vector, but in this case the result was empty, so sapply()
>>>> returned an empty list; assigning a list to a vector coerced the vector
>>>> to a list, and then the "invalid subscript type 'list'" came soon after.
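
The failure mode described above can be reproduced in a few lines. The variable names are invented for illustration and are not from install.packages().

```r
x <- c(1, 2, 3)     # an ordinary double vector
idx <- integer(0)   # nothing to process in this corner case

out <- sapply(idx, function(i) i * 2)
class(out)          # "list": nothing to simplify, so list() comes back

x[idx] <- out       # assigns no elements, but still coerces x
typeof(x)           # "list"

# Using the coerced vector as a subscript then fails
msg <- tryCatch(letters[x], error = function(e) conditionMessage(e))
```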
>>> I've run into this sort of problem a lot (0-long input to sapply
>>> causes it to return list()). A related problem is that when sapply's
>>> FUN doesn't always return the type of value you expect for some
>>> corner case then sapply won't do the expected simplification. If
>>> sapply had an argument that gave the expected form of FUN's output
>>> then sapply could (a) die if some call to FUN didn't return something
>>> of that form and (b) return a 0-long object of the correct form
>>> if sapply's X has length zero so FUN is never called. E.g.,
>>>    # die on third iteration
>>>    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1))
>>>    # return logical(0)
>>>    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1))
>>>
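
Both requested behaviours can be tried with vapply(), which is in base R and takes a FUN.VALUE template. It is used here as a stand-in for the proposed sapply() argument; the tryCatch() wrapper is only there so the sketch runs to completion.

```r
# (b) zero-length input: FUN is never called, the result is still typed
empty <- vapply(integer(0), function(i) i > 0, FUN.VALUE = logical(1))
empty   # logical(0)

# (a) (11:20)[0] has length 0, so the third call does not match
#     integer(1) and vapply() signals an error
outcome <- tryCatch(
  vapply(2:0, function(i) (11:20)[i], FUN.VALUE = integer(1)),
  error = function(e) "error"
)
outcome # "error"
```
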
>>> Another benefit of sapply knowing the type of FUN's return value is
>>> that it wouldn't have to waste space creating a list of FUN's return
>>> values but could stuff them directly into the final output structure.
>>> A list of n scalar doubles is 4.5 times bigger than double(n) and the
>>> factor is 9.0 for integers and logicals.
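
The size comparison can be checked locally with utils::object.size(). The ratios depend on the platform and R build, so they may differ from the 4.5 and 9.0 quoted above; n = 1000 is an arbitrary choice for the sketch.

```r
n <- 1000

# A list of n scalar doubles versus one double vector of length n
dbl_list <- as.list(as.double(seq_len(n)))
dbl_vec  <- double(n)
dbl_ratio <- as.numeric(utils::object.size(dbl_list)) /
             as.numeric(utils::object.size(dbl_vec))

# The same comparison for integers
int_list <- as.list(seq_len(n))
int_vec  <- integer(n)
int_ratio <- as.numeric(utils::object.size(int_list)) /
             as.numeric(utils::object.size(int_vec))

c(double = dbl_ratio, integer = int_ratio)
```
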
>>
>> What do you think of the behaviour of the sapply function below? (I
>> wouldn't put it into R as it is, I'd translate it to C code to avoid the
>> lapply call; but I'd like to get the behaviour right before doing that.)
>>
>> This one checks that the length() and typeof() results are consistent.
>> If the FUN.VALUE has names, those are used (but it doesn't require the
>> names from FUN to match).
> ...
> 
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)              FAX: (+45) 35327907

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 04 Nov 2009 - 21:21:26 GMT