Re: [R] SQL INSERT using RMySQL

From: Gregory Warnes <gregory.warnes_at_mac.com>
Date: Sat, 12 Apr 2008 14:40:20 -0400

Hi All,

I figured out my problem. There was a combination of lack of understanding on my part, and a bit of missing functionality. I made a small patch to the rmysqlWriteTable() function passes the field names to MySQL corresponding to the data columns passed in:

diff -ru RMySQL.orig/R/MySQLSupport.R RMySQL/R/MySQLSupport.R

--- RMySQL.orig/R/MySQLSupport.R	2007-05-31 22:36:02.000000000 -0400
+++ RMySQL/R/MySQLSupport.R	2008-04-11 17:50:29.000000000 -0400
@@ -616,7 +616,9 @@
     on.exit(unlink(fn), add = TRUE)
     sql4 <- paste("LOAD DATA LOCAL INFILE '", fn, "'",
                    " INTO TABLE ", name,
-                  " LINES TERMINATED BY '\n' ", sep="")
+                  " LINES TERMINATED BY '\n' ",
+                  " ( ", paste(names(field.types), collapse=", "),  
");",
+                 sep="")
     rs <- try(dbSendQuery(new.con, sql4))
     if(inherits(rs, ErrorClass)){
        warning("could not load data into table")

I also defined a useful function for describing the structure of an existing table:

setGeneric(

            "dbDescribeTable",
            function(conn, name, ...)
              standardGeneric("dbDescribeTable"),
            valueClass = "character"
            )


setMethod(
           "dbDescribeTable",
           signature(conn="MySQLConnection", name="character"),
           def = function(conn, name, ...){
             rs <- dbGetQuery(conn, paste("describe", name))
             fields <- rs$Type
             names(fields) <- rs$Field
             if(length(fields)==0)
               fields <- character()
             fields
           },
           valueClass = "character"
           )

And I now have working code:

> ## Columns in the table
> dbDescribeTable(con, "past_purchases")

                 id        customer_id           item_upc
"int(10) unsigned"          "int(11)"       "bigint(12)"
          suggested           quantity              total
       "tinyint(1)"          "int(11)"          "int(11)"
            on_sale       actual_price           featured
       "tinyint(1)"           "double"       "tinyint(1)"
               date
             "date"

>
> ## columns in my data (note the absence of the primary key 'id')
> head(fulldata)

   customer_id item_upc suggested quantity total on_sale
1           3 11111111632     FALSE        1     1   FALSE
2           3 11111111733     FALSE        1     1   FALSE
3           3 11111116095     FALSE        1     1   FALSE
4           3 11111117164     FALSE        1     1   FALSE
5           3 11111117210     FALSE        1     1   FALSE
6           3 11111119092     FALSE        1     1   FALSE
   actual_price featured       date
1        10.49    FALSE 2008-03-22
2         4.99    FALSE 2008-03-22
3         5.49    FALSE 2008-03-22
4         9.99    FALSE 2008-03-22
5         4.19    FALSE 2008-03-22
6         3.99    FALSE 2008-03-22

>
> dim(fulldata)

[1] 75 9
>
>
> ## Size of the table before adding my data
> dbGetQuery(con, "SELECT COUNT(ID) FROM past_purchases")[1,1]
[1] 675
>
> ## Insert the data
> dbWriteTable(
+              con,
+              "past_purchases",
+              value=fulldata,
+              overwrite=FALSE,
+              append=TRUE,
+              row.names=FALSE #,
+              #field.types=field.types
+              )

[1] TRUE
>
> ## Size of the table after adding my data
> dbGetQuery(con, "SELECT COUNT(ID) FROM past_purchases")[1,1]
[1] 750

-Greg

On Apr 11, 2008, at 10:57PM , Chris Stubben wrote:
>
> Greg,
>
> If you have a MySQL table with an auto_increment field, you could just
> insert a NULL value into that column and the database will
> increment the key
> (it may not work in SQL STRICT mode, I'm not sure). I don't think
> there's
> any way to specify which columns you want to load data into using
> dbWriteTable yet, but that would be a nice feature since LOAD data now
> allows that (and SET syntax and other options).
>
> Try this code below - I used cbind(NA, x) to insert a null into the
> first
> column.
>
> Chris
>
>> dbSendQuery(con, "create table tmp (id int not null auto_increment
>> primary
>> key, a char(1), b char(1))")
> <MySQLResult:(369,1,67)>
>> x<-data.frame( a=letters[1:3], b=letters[4:6])
>> x
> a b
> 1 a d
> 2 b e
> 3 c f
>> dbWriteTable(con, "tmp", cbind(NA, x), row.name=FALSE, append=TRUE)
> [1] TRUE
>> dbWriteTable(con, "tmp", cbind(NA, x), row.name=FALSE, append=TRUE)
> [1] TRUE
>> dbReadTable(con, "tmp")
> id a b
> 1 1 a d
> 2 2 b e
> 3 3 c f
> 4 4 a d
> 5 5 b e
> 6 6 c f
>
>
>
>
>
> Gregory. R. Warnes wrote:
>>
>> Hi All,
>>
>> I've finally gotten around to database access using R. I'm happily
>> extracting rows from a MySQL database using RMySQL, but am having
>> problems appending rows to an existing table.
>>
>> What I *want* to do is to append rows to the table, allowing the
>> database to automatically generate primary key values. I've only
>> managed to add rows by using
>>
>> dbWriteTable( con, "past_purchases", newRecords, overwrite=FALSE,
>> append=TRUE, ...)
>>
>> And this only appears to properly append rows (as opposed to
>> overwriting them) IFF
>> 1) the row names for newRecords are new unique primary key values,
>> 2) the argument row.names is TRUE.
>>
>> If row.names is FALSE, the records will not be appended, even if
>> newRecords contains a column (named 'id') of unique values that
>> corresponding to the primary key (named 'id').
>>
>> It appears that in this case, the row names on the data frame are
>> still being used for the primary key, and since overwrite is FALSE,
>> the new records are being silently dropped.
>>
>>
>> I did manage to get things working by doing the following:
>>
>> ## get the last used id value (primary key)
>> maxId <- dbGetQuery(con, "SELECT MAX(id) FROM past_purchases")[1,1]
>> maxId
>> if(is.na(maxId)) maxId <- -1
>>
>> ## add the new unique primary keys as row names
>> rownames(fulldata) <- maxId + 1:nrow(fulldata)
>>
>> ## now write out the data
>> dbWriteTable(con, "past_purchases", value=fulldata, overwrite=FALSE,
>> append=TRUE, row.names=TRUE)
>>
>>
>> Is there a better way to accomplish this task? (Session info is
>> below)
>>
>> Thanks!,
>>
>> -Greg
>>
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/SQL-INSERT-
> using-RMySQL-tp16640280p16644954.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Mon 14 Apr 2008 - 01:20:18 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 14 Apr 2008 - 01:30:31 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive