Re: [Rd] [R] HTTP User-Agent header

From: Robert Gentleman <rgentlem_at_fhcrc.org>
Date: Fri 28 Jul 2006 - 19:52:27 GMT

OK, that suggests that setting it at the options level would solve both of your problems, and that seems like the best approach. I don't really want to pass this around as a parameter through the maze of functions that might actually download something if we don't have to.
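
As a rough illustration of the options-level approach described above (a sketch only, written in Python rather than R, and not the actual patch): the agent string lives in one global setting, and only the lowest-level function that builds the request reads it. The names OPTIONS, http_user_agent, build_request and fetch_package_index, the agent string, and the example URL are all made up for this sketch.

```python
import urllib.request

# Hypothetical global option store, standing in for a session-wide setting.
OPTIONS = {"http_user_agent": "R-example/0.0 (hypothetical)"}


def build_request(url):
    """Build a request, reading the User-Agent from the global option."""
    headers = {}
    agent = OPTIONS.get("http_user_agent")
    if agent:  # None or "" means: leave the header unset on this request
        headers["User-Agent"] = agent
    return urllib.request.Request(url, headers=headers)


def fetch_package_index(repo_url):
    """A caller several layers up; it never handles the agent string."""
    return build_request(repo_url + "/PACKAGES")


req = fetch_package_index("http://example.org/repo")
print(req.get_header("User-agent"))
```

Changing or clearing the single OPTIONS entry affects every download path at once, which is the point of not threading a parameter through each intermediate function.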

I think we can provide something early next week on R-devel for folks to test. But I suspect, as Henrik does too, that the set of sites that will refuse us with a User-Agent header will be much larger than the set James has found that refuse us without one.

best wishes

   Robert

Henrik Bengtsson wrote:
> On 7/28/06, Robert Gentleman <rgentlem@fhcrc.org> wrote:

>> I wonder if it would not be better to make the user agent string
>> something that is configurable (at the time R is built) rather than at
>> run time. This would make Seth's patch about 1% as long. Or this could
>> be handled as an option. The patches are pretty extensive and allow for
>> setting the agent header by setting parameters in function calls (e.g.
>> download.file). I am not sure there is a good use case for that level
>> of flexibility and the additional code is substantial.
>>
>>
>> The issue that I think arises is that there are potentially other
>> systems that will be unhappy with R's identification of itself and so
>> some users may also need to turn it off.
>>
>> Any strong opinions?

>
> Actually two:
>
> 1) If you wish to pull down (read: extract from HTML or similar) live
> data from the web, you might want to be able to "imitate" a certain
> browser. For instance, if you tell some webserver you're a simple
> "mobile phone" or "lynx", you might be able to get back very clean data.
> Some servers might also block unknown web browsers.
>
> 2) If the webserver of a package repository decided to make use of
> the user-agent string to decide what version of the repository it
> should deliver, I would like to be able to trick the server. Why?
> Many times I have found myself working on a system where I do not have
> the rights to update to the latest or the development version of R.
> However, although I do not have the very latest version of R, I can
> still get work done. For instance, in Bioconductor, biocLite() & co give
> you either the stable or the development version of Bioconductor
> depending on your R version, but looking into the biocLite() code and
> beyond, you find that you can actually install a Bioconductor v1.9
> package in R v2.3.1.
> It can be risky business, but if you know what you're doing, it can
> save your day (or week).
>
> Cheers
>
> Henrik
>
>>
>> James P. Howard, II wrote:
>>> On 7/28/06, Seth Falcon <sfalcon@fhcrc.org> wrote:
>>>
>>>> I have a rough draft patch, see below, that adds a User-Agent header
>>>> to HTTP requests made in R via download.file.  If there is interest, I
>>>> will polish it.
>>> It looks right, but I am running under Windows without a compiler.
>>>
>> --
>> Robert Gentleman, PhD
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M2-B876
>> PO Box 19024
>> Seattle, Washington 98109-1024
>> 206-667-7700
>> rgentlem@fhcrc.org
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>


Received on Sat Jul 29 05:55:35 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.