Re: [Rd] Portability and Memory Issues for R-package

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Tue 27 Dec 2005 - 20:03:32 GMT

On 12/25/2005 2:35 PM, KNygren@us.imshealth.com wrote:
> I have an upcoming JASA paper with an iid sampling algorithm for Bayesian generalized linear models (e.g., logit, Poisson regression, and conditional logit models with multivariate normal priors). At this point, I have implemented the algorithms in C and hope to make the functions and corresponding source code available through an R package. I have successfully written the code needed to create and install a package with most of the functions on my local machine (using R CMD check, R CMD build, and R CMD INSTALL). As my code makes extensive use of the GSL matrix library, however, I have some questions regarding the portability of my package. I am also running into some memory issues when making repeated calls to my functions, which I hope to fix before formally distributing the package. More specifically, the issues are the following:
>
> I. Portability-
>
> Since I make extensive use of the gsl library in my C code, I have the gsl library installed on my local machine (within the MinGW directory, so it is included in the path). Within the package, I then include a Makevars file with the following code in order to link to the gsl library:
>
> PKG_LIBS=-lgsl -lgslcblas
>
> I also know that there is an R package (gsl) making use of some gsl functions which contains a Makevars.win file with the following code:

This package requires manual handling to build for Windows, and probably for some other platforms if they don't come with gsl by default.

My recommendation would be to work with its author (Robin Hankin, see the DESCRIPTION file for contact information) to add whatever functions are not already there, and then just make your package depend on the R package, rather than on the GSL library directly.

This will mean that all the manual work that has been done to get gsl to build will not need to be repeated by anyone who wants to install your package.

Duncan Murdoch

> PKG_LIBS=-LF:/MinGW/usr/local/lib -lgsl -lgslcblas
> # CPPFLAGS=-I$(R_HOME)/include -IF:/MinGW/usr/local/include
> PKG_CPPFLAGS=-IF:/MinGW/usr/local/include
>
> For my package to install properly on other machines, however, I take it they would have to have the gsl library files already installed in the proper location (or am I mistaken here?). In order to make it fully portable on other machines, it thus seems like I would need to either include instructions for how to first install the gsl library prior to installation (which would have to be platform specific), or to somehow have the gsl library files installed during the R package installation. Is the latter even possible? If so, how could it be done (the key files are likely the two library files)? I believe the gsl package requires the user to have the gsl library preinstalled.
>
> I guess a long-term option is for me to rework my C code to eliminate the dependence on the gsl library. This could, however, be a time-consuming effort. In the meantime, would it be possible to contribute the package with the existing dependence (as I think is the case for the gsl package)?
>
> II. Memory Issue-
>
> The functions in my package are generally fast and seem to work well if I make a limited number of calls to them from my R code. If I try to use them as part of an R MCMC implementation (say, updating each Gibbs block 10,000 times in an R loop), I run into memory issues. Although my underlying C code frees the memory behind every pointer it allocates, Windows does not seem to recognize that the memory has been freed. This is apparent because the Mem Usage for RGUI.exe in the Windows Task Manager keeps growing throughout the loop, and the code slows down and eventually makes virtually no progress. I have noticed similar issues in the past when calling WinBUGS repeatedly using Gelman's functions, so the issue is likely not coming just from my code.
> I suspect that the memory issues could have something to do with the fact that my C code makes repeated use of the gsl_matrix_alloc and gsl_matrix_free functions rather than the R_alloc function (I suspect that the memory is not garbage collected). I searched the web and found the following suggestion from Brian Gough in response to a similar question posted on the gsl discussion forum.
> "If you want to return an R object containing a gsl_matrix which can be garbage collected then you could use a C++ wrapper, as the C++ interface in R allows the use of separate constructors and destructors."
> Would this be a possible solution? If so, where can I find information on how to write such wrapper functions for gsl matrices? I must admit that I am not familiar with how the use of separate constructors and destructors would work. If that is not the solution, would anyone have any other ideas as to how I can solve the memory issues?
> Kjell Nygren
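
One way to test the assumption that the C code really frees everything is to count allocations directly, independent of what the Windows Task Manager reports. The sketch below is plain C with no R or GSL dependency. The matrix type and the functions matrix_alloc_counted, matrix_free_counted, sampler_step, sampler_step_leaky, and leaked_after are all hypothetical stand-ins, not part of GSL or of the package discussed here; in real code the counted wrappers would presumably call gsl_matrix_alloc and gsl_matrix_free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for gsl_matrix: a dense block of doubles. */
typedef struct {
    size_t rows, cols;
    double *data;
} matrix;

/* Number of matrices allocated but not yet freed. */
static long live_matrices = 0;

/* Counted allocator (assumed replacement for gsl_matrix_alloc). */
matrix *matrix_alloc_counted(size_t rows, size_t cols)
{
    matrix *m = malloc(sizeof *m);
    if (m == NULL) return NULL;
    m->data = calloc(rows * cols, sizeof *m->data);
    if (m->data == NULL) { free(m); return NULL; }
    m->rows = rows;
    m->cols = cols;
    live_matrices++;
    return m;
}

/* Counted release (assumed replacement for gsl_matrix_free). */
void matrix_free_counted(matrix *m)
{
    if (m == NULL) return;
    assert(live_matrices > 0);
    free(m->data);
    free(m);
    live_matrices--;
}

/* One hypothetical sampler update that needs two scratch matrices.
   Returns 0 on success, -1 on failure. Every exit goes through `done`,
   so both temporaries are released on every path. */
int sampler_step(size_t dim, int simulate_failure)
{
    int status = -1;
    matrix *work = NULL, *chol = NULL;

    work = matrix_alloc_counted(dim, dim);
    chol = matrix_alloc_counted(dim, dim);
    if (work == NULL || chol == NULL) goto done;
    if (simulate_failure) goto done;    /* e.g. a failed factorization */

    work->data[0] = 1.0;                /* placeholder for the real work */
    chol->data[0] = work->data[0];
    status = 0;
done:
    matrix_free_counted(chol);
    matrix_free_counted(work);
    return status;
}

/* Same idea with a typical bug: the error path returns before freeing. */
int sampler_step_leaky(size_t dim, int simulate_failure)
{
    matrix *work = matrix_alloc_counted(dim, dim);
    if (work == NULL) return -1;
    if (simulate_failure) return -1;    /* leak: work is never freed */
    matrix_free_counted(work);
    return 0;
}

/* Run a step n times, failing on every fail_every-th call (0 = never),
   and report how many matrices were left live by those calls. */
long leaked_after(int (*step)(size_t, int), int n, size_t dim, int fail_every)
{
    long before = live_matrices;
    for (int i = 0; i < n; i++)
        (void)step(dim, fail_every > 0 && i % fail_every == 0);
    return live_matrices - before;
}
```

Running the real sampler through a counter like this for 10,000 iterations should leave the count at zero if every allocation is matched by a free, including on error paths. If it does, the growth is more likely coming from somewhere else, for example from objects created on the R side of the loop. For a matrix that has to outlive a single call, R's C API also provides external pointers with finalizers (R_MakeExternalPtr and R_RegisterCFinalizerEx), which may make a C++ wrapper unnecessary.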
>
> Kjell Nygren, Ph.D.
> Director Pricing and Advanced Analytics
> Statistical Services
> IMS Health®
> 960 Harvest Drive, Building A
> Blue Bell, PA 19422 USA
> voice: 610.832.5586 * fax: 610.832.5850
> email: <mailto:knygren@us.imshealth.com>
> www.imshealth.com
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



Received on Wed Dec 28 07:18:02 2005
