Re: [R] read.csv fails to read a CSV file from google docs

From: Philipp Pagel <p.pagel_at_wzw.tum.de>
Date: Fri, 29 Apr 2011 20:27:45 +0200

On Fri, Apr 29, 2011 at 06:19:24PM +0300, Tal Galili wrote:
>
> data_url <- "
> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
> "
> read.csv(data_url)
> Error in file(file, "rt") : cannot open the connection

I get the same error (R 2.11.1, Debian Linux) and don't have a solution. But I did some tests and found the origin of the problem.

I can download the file from Google with wget, but I get some interesting information in the process:



$ wget -v 'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv'
--2011-04-29 20:07:40-- http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Resolving spreadsheets0.google.com... 209.85.148.139, 209.85.148.113, 209.85.148.138, ...
Connecting to spreadsheets0.google.com|209.85.148.139|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv [following]
--2011-04-29 20:07:41-- https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Connecting to spreadsheets0.google.com|209.85.148.139|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: “pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1”
    [ <=>                                                                                                       ] 41          --.-K/s   in 0s      

2011-04-29 20:07:42 (342 KB/s) - “pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1” saved [41]


The message that caught my attention was the HTTP redirection: "302 Moved Temporarily". The server redirects the http request to an https URL.

If you try again with the new URL, you get this:

> read.csv(url("https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&g"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") : unsupported URL scheme

?url told me "Note that ‘https://’ connections are not supported." Case closed, problem unsolved...

Dirty workaround: use system() to call wget (or whatever equivalent command is available on Windows) to download the file, then read the local copy.
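
A minimal sketch of that workaround, assuming wget is installed and on the PATH. The helper name read_csv_via_wget is made up for illustration. Instead of building the command line for system() by hand, it uses download.file() with method = "wget", which hands the transfer to the external wget program; wget follows the 302 redirect and handles https itself:

```r
# Hypothetical helper, not part of R: download a URL with the external
# wget program and read the saved copy with read.csv().
# Assumption: wget is installed and can be found on the PATH.
read_csv_via_wget <- function(data_url, ...) {
  tmp <- tempfile()          # local file that will hold the download
  on.exit(unlink(tmp))       # remove the temporary file when done
  # method = "wget" makes R call wget rather than its internal http code
  status <- download.file(data_url, destfile = tmp,
                          method = "wget", quiet = TRUE)
  if (status != 0) stop("wget failed with exit status ", status)
  # Extra arguments (header, sep, ...) are passed on to read.csv()
  read.csv(tmp, ...)
}
```

Usage would be read_csv_via_wget(data_url), with data_url holding the complete URL as a single string. On Windows this only works if a wget binary has been installed separately.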

cu

        Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 29 Apr 2011 - 18:31:28 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 29 Apr 2011 - 19:10:35 GMT.
