From: William Dunlap <wdunlap_at_tibco.com>

Date: Wed, 04 Nov 2009 13:04:56 -0800

R-devel_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 04 Nov 2009 - 21:09:00 GMT

> -----Original Message-----
> From: William Dunlap
> Sent: Wednesday, November 04, 2009 12:53 PM
> To: 'Duncan Murdoch'
> Cc: r-devel_at_r-project.org
> Subject: RE: sapply improvements
>
> It looks good on the following examples:
>
> > z <- split(log(1:10), rep(letters[1:2],c(3,7)))
> > sapply(z, length, FUN.VALUE=numeric(1))
> Error in sapply(z, length, FUN.VALUE = numeric(1)) :
>   FUN values must be of type 'double'
>
> (I'd like the error to say "... must be of type 'double',
> not 'integer'", to give the user a fuller diagnosis of
> the problem.)
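The fuller diagnostic asked for above could look roughly like the following. This is only a sketch: `check_fun_value` is a hypothetical helper name, not part of R and not taken from the prototype under discussion, and the message wording is one guess at what would be useful.

```r
# Hypothetical helper (not part of R): compare one result of FUN against the
# FUN.VALUE template and report both the expected and the actual type/length.
check_fun_value <- function(value, FUN.VALUE) {
  if (typeof(value) != typeof(FUN.VALUE)) {
    stop(sprintf("FUN values must be of type '%s', not '%s'",
                 typeof(FUN.VALUE), typeof(value)), call. = FALSE)
  }
  if (length(value) != length(FUN.VALUE)) {
    stop(sprintf("FUN values must be of length %d, not %d",
                 length(FUN.VALUE), length(value)), call. = FALSE)
  }
  invisible(TRUE)
}

z <- split(log(1:10), rep(letters[1:2], c(3, 7)))

# length() returns an integer, so it does not match a numeric(1) template
msg <- tryCatch(check_fun_value(length(z[[1]]), numeric(1)),
                error = function(e) conditionMessage(e))
print(msg)
```

Reporting the actual type next to the expected one means the user does not have to re-run FUN by hand to find out what it returned.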

If this new argument gets used much it may give a push towards getting functions to always return the same type of output. E.g., range(integer(0)) returns a numeric while range(integer(1)) an integer, resulting in:

    > z<-split(1:10, cut(log(1:10),breaks=0:4,include.lowest=TRUE))
    > # z[[4]] is integer(0)
    > sapply(z,range,FUN.VALUE=integer(2))
    Error in sapply(z, range, FUN.VALUE = integer(2)) :
      FUN values must be of type 'integer'
    In addition: Warning messages:
    1: In min(x) : no non-missing arguments to min; returning Inf
    2: In max(x) : no non-missing arguments to max; returning -Inf
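The type change can be checked directly, and one way around it is a wrapper that always returns integers. This is a sketch under assumptions: `int_range` is a hypothetical helper, returning NA for an empty group is a guess at what a caller would want, and vapply() (which takes a FUN.VALUE template) stands in for the sapply prototype discussed in this thread.

```r
z <- split(1:10, cut(log(1:10), breaks = 0:4, include.lowest = TRUE))

# range() keeps the integer type for non-empty input ...
t_nonempty <- typeof(range(integer(1)))
# ... but falls back to c(Inf, -Inf), a double, for empty input
t_empty <- typeof(suppressWarnings(range(integer(0))))

# Hypothetical wrapper: always return two integers, NA for an empty group
int_range <- function(x) {
  if (length(x) == 0L) c(NA_integer_, NA_integer_) else range(x)
}

r <- vapply(z, int_range, FUN.VALUE = integer(2))
print(r)
```

With the wrapper, every call returns an integer vector of length 2, so the type check passes for the empty fourth group as well.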

>
> > sapply(z, range, FUN.VALUE=c(Min=0,Max=0))
>            a        b
> Min 0.000000 1.386294
> Max 1.098612 2.302585
>
> Exactly matching the typeof's and using the names
> for row.names on matrix output seem good to me.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
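The use of FUN.VALUE names as row names can be tried with vapply(), which accepts a template of the same kind. Using vapply() in place of the prototype sapply is an assumption made here for illustration only.

```r
z <- split(log(1:10), rep(letters[1:2], c(3, 7)))

# The names on the template (Min, Max) label the rows of the result;
# the names of z (a, b) label the columns.
m <- vapply(z, range, FUN.VALUE = c(Min = 0, Max = 0))
print(m)
```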
> > -----Original Message-----
> > From: Duncan Murdoch [mailto:murdoch_at_stats.uwo.ca]
> > Sent: Wednesday, November 04, 2009 12:24 PM
> > To: William Dunlap
> > Cc: michael.m.spiegel_at_gmail.com; r-devel_at_stat.math.ethz.ch
> > Subject: sapply improvements
> >
> > On 11/4/2009 12:15 PM, William Dunlap wrote:
> > >> -----Original Message-----
> > >> From: r-devel-bounces_at_r-project.org
> > >> [mailto:r-devel-bounces_at_r-project.org] On Behalf Of Duncan Murdoch
> > >> Sent: Wednesday, November 04, 2009 8:47 AM
> > >> To: michael.m.spiegel_at_gmail.com
> > >> Cc: R-bugs_at_r-project.org; r-devel_at_stat.math.ethz.ch
> > >> Subject: Re: [Rd] error in install.packages() (PR#14042)
> > >>
> ...
> > >> For future reference: the problem was that it assigned the result of
> > >> sapply() to a subset of a vector. Normally sapply() simplifies its
> > >> result to a vector, but in this case the result was empty, so sapply()
> > >> returned an empty list; assigning a list to a vector coerced the vector
> > >> to a list, and then the "invalid subscript type 'list'" came soon after.
> > >
> > > I've run into this sort of problem a lot (0-long input to sapply
> > > causes it to return list()). A related problem is that when sapply's
> > > FUN doesn't always return the type of value you expect for some
> > > corner case then sapply won't do the expected simplification. If
> > > sapply had an argument that gave the expected form of FUN's output
> > > then sapply could (a) die if some call to FUN didn't return something
> > > of that form and (b) return a 0-long object of the correct form
> > > if sapply's X has length zero so FUN is never called. E.g.,
> > >    sapply(2:0, function(i)(11:20)[i], FUN.VALUE=integer(1)) # die on third iteration
> > >    sapply(integer(0), function(i)i>0, FUN.VALUE=logical(1)) # return logical(0)
> > >
> > > Another benefit of sapply knowing the type of FUN's return value is
> > > that it wouldn't have to waste space creating a list of FUN's return
> > > values but could stuff them directly into the final output structure.
> > > A list of n scalar doubles is 4.5 times bigger than double(n) and the
> > > factor is 9.0 for integers and logicals.
> >
> >
> > What do you think of the behaviour of the sapply function below? (I
> > wouldn't put it into R as it is, I'd translate it to C code to avoid the
> > lapply call; but I'd like to get the behaviour right before doing that.)
> >
> > This one checks that the length() and typeof() results are consistent.
> > If the FUN.VALUE has names, those are used (but it doesn't require the
> > names from FUN to match).
> ...
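The two cases quoted above (zero-length input, and a FUN result of the wrong length) and the size claim can be tried with vapply(), which accepts a FUN.VALUE template. Treating vapply() as a stand-in for the sapply prototype discussed here is an assumption, and the size ratio depends on platform and R version, so no figure is given.

```r
# Without a template, sapply() on zero-length input cannot know the
# element type and returns an empty list
s <- sapply(integer(0), function(i) i > 0)

# With a template, the zero-length result has the expected type
v <- vapply(integer(0), function(i) i > 0, FUN.VALUE = logical(1))

# (11:20)[0] has length 0, so the third call does not match integer(1)
res <- tryCatch(
  vapply(2:0, function(i) (11:20)[i], FUN.VALUE = integer(1)),
  error = function(e) "error"
)

# Rough size comparison: a list of n scalar doubles versus double(n)
n <- 1000
ratio <- as.numeric(object.size(as.list(numeric(n)))) /
  as.numeric(object.size(numeric(n)))

print(list(sapply_empty = s, vapply_empty = v, mismatch = res, ratio = ratio))
```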


This archive was generated by hypermail 2.2.0 : Wed 04 Nov 2009 - 21:20:20 GMT