Re: [R] Decimals in R/SQL

From: jim holtman <jholtman_at_gmail.com>
Date: Wed, 13 Apr 2011 12:51:39 -0400

The problem is that you data is 'integer' and I assume that the database is keeping everything integer. You can do what you are doing, or convert to 'numeric':

> x <- read.table(textConnection(" ST AGEP PWGTP

+ 33323    1                30                 130
+ 33324    1                10                 186
+ 33325    1                 2                 162
+ 33326    1                80                  93
+ 33327    1                29                 135
+ 33328    1                66                  54
+ 33329    1                62                  54
+ 33330    1                21                 138
+ 33331    1                29                 103
+ 33332    1                 7                 144
+ 33333    1                 5                 143"), header = TRUE)

> closeAllConnections()
> str(x)

'data.frame': 11 obs. of 3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : int  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...

> require(sqldf)
> xsum <- sqldf('
+     select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+         from x
+         group by ST
+ ')

> xsum

  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1                             23

> # change to numeric instead of integer
> x$AGEP <- as.numeric(x$AGEP)
> str(x)

'data.frame': 11 obs. of 3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : num  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...

> xsum <- sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST

+ ')
> xsum

  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1                       23.81446

>

On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata <RachelL_at_kff.org> wrote:
> Thanks Jim.  It appears the issue may only be in SQLite.  SS09 is a large table and here is a subset of the variables I am working with.

>
>

> SS09
>            ST                  AGEP                    PWGTP
> 33323    1                30                 130
> 33324    1                10                 186
> 33325    1                 2                 162
> 33326    1                80                  93
> 33327    1                29                 135
> 33328    1                66                  54
> 33329    1                62                  54
> 33330    1                21                 138
> 33331    1                29                 103
> 33332    1                 7                 144
> 33333    1                 5                 143
>

> z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage from ss09 group by ST")
>

> ST wgtage
> 1   1     37
> 2   2     33
> 3   4     36
> 4   5     37
> 5   6     35
>

> z <- dbGetQuery( connSQLite , "select ST, SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from ss09 group by ST")
>

> ST   wgtage
> 1   1 37.57083
> 2   2 33.94322
> 3   4 36.14499
> 4   5 37.51233
> 5   6 35.65581
>

> -----Original Message-----
> From: jim holtman [mailto:jholtman_at_gmail.com]
> Sent: Wednesday, April 13, 2011 12:16 PM
> To: Rachel Licata
> Cc: r-help_at_r-project.org
> Subject: Re: [R] Decimals in R/SQL
>

> You at least have to provide a subset of 'ss09' so we can see what the
> original data looks like.  I have not had any problems with decimals
> in using sqldf.
>
>> x <- as.data.frame(matrix(runif(100)*100, 10))
>> x$key <- sample(1:3, 10, TRUE)
>> require(sqldf)
>> xsum <- sqldf('

> +     select key, sum(V1 * V2) / sum(V3)
> +         from x
> +         group by key
> + ')
>>
>>
>> xsum

>  key sum(V1 * V2) / sum(V3)
> 1   1               19.38166
> 2   2               17.40503
> 3   3               71.48818
>> dput(xsum)

> structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
> 17.4050302312273, 71.4881812227571)), .Names = c("key", "sum(V1 * V2) / sum(V3)"
> ), row.names = c(NA, 3L), class = "data.frame")
>>
>
>
>

> On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata <RachelL@kff.org> wrote:
>> Hello,
>>
>> When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
>> The only way I can get decimals is by multiplying by 1.00000, etc.  I
>> have tried manipulating the options, but it is only effective once I
>> multiply by 1.0000.
>>
>> I appreciate any suggestions!
>>
>> Thanks!
>>
>> Example:
>>
>> z <- sqldf ("select ST,
>> SUM(AGEP*PWGTP)*1.000000000/SUM(PWGTP)*1.00000000000000 as wgtage from
>> ss09 group by ST")
>>
>> z shouldn't be whole numbers.
>>
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>

> --
> Jim Holtman
> Data Munger Guru
>

> What is the problem that you are trying to solve?
>
-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 13 Apr 2011 - 16:55:48 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 13 Apr 2011 - 17:40:30 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive