Re: [R] thousand separator (was RE: weight)

From: John Kane <jrkrideau_at_yahoo.ca>
Date: Tue, 01 May 2007 08:56:50 -0400 (EDT)

> Looks very neat, Gabor!
>
> I just cannot fathom why anyone who want to write
> numerics with those
> separators in a flat file. That's usually not for
> human consumption,
> and computers don't need those separators!
>
> Andy

It' often a case of taking what you can get. I've seem myself taking formatted numbers from report intended for reading and then cutting and pasting them into a text editor.

> From: Gabor Grothendieck
> >
> > That could be accomplished using a custom class
> like this:
> >
> > library(methods)
> > setClass("num.with.junk")
> > setAs("character", "num.with.junk",
> > function(from) as.numeric(gsub(",", "", from)))
> >
> >
> > ### test ###
> >
> > Input <- "A B
> > 1,000 1
> > 2,000 2
> > 3,000 3
> > "
> > DF <- read.table(textConnection(Input), header =
> TRUE,
> > colClasses = c("num.with.junk", "numeric"))
> > str(DF)
> >
> >
> >
> > On 4/30/07, Liaw, Andy <andy_liaw_at_merck.com>
> wrote:
> > > Still, though, it would be nice to have the data
> read in
> > correctly in
> > > the first place, instead of having to do this
> kind of
> > post-processing
> > > afterwards...
> > >
> > > Andy
> > >
> > > From: Bert Gunter
> > > >
> > > > Nothing! My mistake! gsub -- not sub -- is
> what you want to
> > > > get 'em all.
> > > >
> > > > -- Bert
> > > >
> > > >
> > > > Bert Gunter
> > > > Genentech Nonclinical Statistics
> > > >
> > > > -----Original Message-----
> > > > From: r-help-bounces_at_stat.math.ethz.ch
> > > > [mailto:r-help-bounces_at_stat.math.ethz.ch] On
> Behalf Of
> > Marc Schwartz
> > > > Sent: Monday, April 30, 2007 10:18 AM
> > > > To: Bert Gunter
> > > > Cc: r-help_at_stat.math.ethz.ch
> > > > Subject: Re: [R] thousand separator (was RE:
> weight)
> > > >
> > > > Bert,
> > > >
> > > > What am I missing?
> > > >
> > > > > print(as.numeric(gsub(",", "",
> "1,123,456.789")), 10)
> > > > [1] 1123456.789
> > > >
> > > >
> > > > FWIW, this is using:
> > > >
> > > > R version 2.5.0 Patched (2007-04-27 r41355)
> > > >
> > > > Marc
> > > >
> > > > On Mon, 2007-04-30 at 10:13 -0700, Bert Gunter
> wrote:
> > > > > Except this doesn't work for "1,123,456.789"
> Marc.
> > > > >
> > > > > I hesitate to suggest it, but gregexpr()
> will do it, as it
> > > > captures the
> > > > > position of **every** match to ",". This
> could be then used
> > > > to process the
> > > > > vector via some sort of loop/apply
> statement.
> > > > >
> > > > > But I think there **must** be a more elegant
> way using
> > > > regular expressions
> > > > > alone, so I, too, await a clever reply.
> > > > >
> > > > > -- Bert
> > > > >
> > > > >
> > > > > Bert Gunter
> > > > > Genentech Nonclinical Statistics
> > > > >
> > > > > -----Original Message-----
> > > > > From: r-help-bounces_at_stat.math.ethz.ch
> > > > > [mailto:r-help-bounces_at_stat.math.ethz.ch] On
> Behalf Of
> > Marc Schwartz
> > > > > Sent: Monday, April 30, 2007 10:02 AM
> > > > > To: Liaw, Andy
> > > > > Cc: r-help_at_stat.math.ethz.ch
> > > > > Subject: Re: [R] thousand separator (was RE:
> weight)
> > > > >
> > > > > One possibility would be to use something
> like the following
> > > > > post-import:
> > > > >
> > > > > > WTPP
> > > > > [1] 1,106.8250 1,336.5138
> > > > >
> > > > > > str(WTPP)
> > > > > Factor w/ 2 levels
> "1,106.8250","1,336.5138": 1 2
> > > > >
> > > > > > as.numeric(gsub(",", "", WTPP))
> > > > > [1] 1106.825 1336.514
> > > > >
> > > > >
> > > > > Essentially strip the ',' characters from
> the factors and
> > > > then coerce
> > > > > the resultant character vector to numeric.
> > > > >
> > > > > HTH,
> > > > >
> > > > > Marc Schwartz
> > > > >
> > > > >
> > > > > On Mon, 2007-04-30 at 12:26 -0400, Liaw,
> Andy wrote:
> > > > > > I've run into this occasionally. My
> current solution is
> > > > simply to read
> > > > > > it into Excel, re-format the offending
> column(s) by
> > unchecking the
> > > > > > "thousand separator" box, and write it
> back out. Not
> > > > exactly ideal to
> > > > > > say the least. If anyone can provide a
> better solution
> > > > in R, I'm all
> > > > > > ears...
> > > > > >
> > > > > > Andy
> > > > > >
> > > > > > From: Natalie O'Toole
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > These are the variables in my file. I
> think the
> > > > variable i'm having
> > > > > > > problems with is WTPP which is of the
> Factor type. Does
> > > > > > > anyone know how to
> > > > > > > fix this, please?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Nat
> > > > > > >
> > > > > > > data.frame': 290 obs. of 5 variables:
> > > > > > > $ PROV : num 48 48 48 48 48 48 48 48
> 48 48 ...
> > > > > > > $ REGION: num 4 4 4 4 4 4 4 4 4 4 ...
> > > > > > > $ GRADE : num 7 7 7 7 7 7 7 7 7 7 ...
> > > > > > > $ Y_Q10A: num 1.1 1.1 1.1 1.1 1.1 1.1
> 1.1 1.1 1.1 1.1 ...
> > > > > > > $ WTPP : Factor w/ 1884 levels
> > > > > > > "1,106.8250","1,336.5138",..: 1544 67
> > > > > > > 1568 40 221 1702 1702 1434 310 310 ...
> > > > > > >
> > > > > > >
> > > > > > > __________________
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --- Douglas Bates <bates_at_stat.wisc.edu>
> wrote:
> > > > > > >
> > > > > > > > On 4/28/07, John Kane
> <jrkrideau_at_yahoo.ca> wrote:
>
=== message truncated ===



R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue 01 May 2007 - 22:12:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 01 May 2007 - 22:31:16 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.