Re: [Rd] dict package: dictionary data structure for R

From: Duncan Temple Lang <duncan_at_wald.ucdavis.edu>
Date: Mon, 23 Jul 2007 04:11:44 -0700

Hi Seth.

Glad you did this. As you know, I think we need more specialized data structures and the ability to introduce them easily into R computations, both internally and at the R language level.

A few things that come to mind after a quick initial look.

The HashFunc typedef in hashfuncs.h would be more flexible if it took an additional argument of type void * to allow for user-defined data. Alternatively, it might take the hash table object itself. The function might want to update the table, or consult a lookup table (e.g. for perfect hashing). And if we had a place to provide additional information, it would be easy to allow the hash function to be an R function.
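To make the void * suggestion concrete, here is a minimal sketch in plain C. It does not reproduce the actual HashFunc typedef from hashfuncs.h; the names HashFuncEx, SeededState and seeded_hash are hypothetical and exist only for illustration. The extra argument carries a seed and a call counter, which shows that the function can both read and update caller-supplied state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature: the extra void * carries caller-supplied state.
   This is an assumption, not the typedef used in the dict package. */
typedef unsigned int (*HashFuncEx)(const char *key, void *user_data);

/* Example state: a seed the function reads and a counter it updates. */
typedef struct {
    unsigned int  seed;
    unsigned long calls;
} SeededState;

/* A simple multiplicative string hash that starts from the seed
   and records how many times it has been called. */
static unsigned int seeded_hash(const char *key, void *user_data)
{
    SeededState *st = (SeededState *) user_data;
    unsigned int h;

    assert(key != NULL && st != NULL);
    h = st->seed;
    st->calls++;
    for (; *key != '\0'; key++)
        h = h * 31u + (unsigned char) *key;
    return h;
}
```

In the R setting, the same slot could presumably hold a pointer to an R closure that a small C trampoline evaluates, which is how an R function could serve as the hash function.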

Also, you are using a "global" table of hash functions (i.e. Dict_HashFunctions) and looking up the C routine using GET_HASHFUN, which is tied to the integer indexing of this global table. Why not use the C routines directly from R, i.e. obtain the routine with getNativeSymbolInfo and pass it from R to the newly created dict? This avoids the lookup and the global table, makes things extensible with routines in other packages, and extends simply to allowing R functions to be passed instead of C routines. It also removes the need to synchronize the labeling system in R and in C, i.e. that 0L corresponds to PJW. The reliance on synchronized names rather than direct handles is unnecessary, although widely used in S/R code.
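A rough sketch of the "direct handle" idea, again in plain C and independent of the package's real code: the dictionary stores the function pointer it was given at creation, so there is no integer code to keep in sync and no global table to consult. The Dict layout, dict_init, dict_bucket and djb2_hash are all hypothetical names. In the R setting, the pointer would come from the address that getNativeSymbolInfo returns, rather than being named in C as it is here.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int (*HashFunc)(const char *key);

/* Hypothetical dictionary header: it keeps the routine itself,
   not an index into a global table of routines. */
typedef struct {
    HashFunc hash;
    size_t   n_buckets;
} Dict;

/* Any routine with the right signature will do, including one that
   lives in another package. This one is the well-known djb2 string hash. */
static unsigned int djb2_hash(const char *key)
{
    unsigned int h = 5381u;

    assert(key != NULL);
    for (; *key != '\0'; key++)
        h = h * 33u + (unsigned char) *key;
    return h;
}

/* The caller hands over the routine when the dict is created. */
static void dict_init(Dict *d, HashFunc hash, size_t n_buckets)
{
    assert(d != NULL && hash != NULL && n_buckets > 0);
    d->hash = hash;
    d->n_buckets = n_buckets;
}

/* Bucket selection calls through the stored handle directly. */
static size_t dict_bucket(const Dict *d, const char *key)
{
    return (size_t) d->hash(key) % d->n_buckets;
}
```

Adding a new hash function then means writing the routine and passing it in; nothing in the dictionary code itself has to change.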

I'm more than happy to give some code to illustrate what I mean more precisely if you'd like it.

  D.

Seth Falcon wrote:
> Hi all,
>
> The dict package provides a dictionary (hashtable) data
> structure much like R's built-in environment objects, but with the
> following differences:
>
> - The Dict class can be subclassed.
>
> - Four different hashing functions are implemented and the user can
> specify which to use when creating an instance.
>
> I'm sending this here as opposed to R-packages because this package
> will only be of interest to developers and because I'd like to get
> feedback from a slightly smaller community before either putting it on
> CRAN or retiring it to /dev/null.
>
> The design makes it fairly easy to add additional hashing functions,
> although currently this must be done in C. If nothing else, this
> package should be useful for evaluating hashing functions (see the
> vignette for some examples).
>
> Source:
> R-2.6.x: http://userprimary.net/software/dict_0.1.0.tar.gz
> R-2.5.x: http://userprimary.net/software/dict_0.0.4.tar.gz
>
> Windows binary:
> R-2.5.x: http://userprimary.net/software/dict_0.0.4.zip
>
>
> + seth
>



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon 23 Jul 2007 - 11:15:31 GMT
