Re: [R] quick help needed: split a number and "find and replace" type of function that works like in MS excel

From: Ram H. Sharma <sharma.ram.h_at_gmail.com>
Date: Sun, 01 May 2011 21:48:00 -0400

Thank you, Steve, for the solution. Following your suggestion, I spent some time trying to make it work for 20000 variables.

nvar <- 3 # number of variables
ncol <- nvar * 2
func1 <- function(x) {
  sapply(strsplit(as.character(x), ""), match, table = c("1", "2", "3", "4", NA))
}

mydf1 <- data.frame(t(apply(mydf, 1, func1)))
colnames(mydf1) <- c(paste("x", 1:nvar, sep = ""))

# the part of your suggestion I could not get to work in this function is:

R> ct1a <- sapply(ct1.char, '[', 1) ## "non-obvious" use of '[' as
R> ct1b <- sapply(ct1.char, '[', 2) ## a function is intentional :-)
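
For reference, the idiom in those two lines can be tried on a small hand-made vector. This is only a sketch: `codes`, `parts`, `first` and `second` are made-up names, not objects from the data above.

```r
# `[` is an ordinary function in R: `[`(v, 1) is the same as v[1].
codes <- c(12, 34, 23)                      # toy input (assumed for illustration)
parts <- strsplit(as.character(codes), "")  # list: c("1","2"), c("3","4"), c("2","3")

# sapply() passes the extra argument (1 or 2) on to `[` as the index.
first  <- sapply(parts, `[`, 1)   # same as sapply(parts, function(v) v[1])
second <- sapply(parts, `[`, 2)

print(first)   # "1" "3" "2"
print(second)  # "2" "4" "3"
```

The results are character vectors; wrap them in as.integer() if numbers are needed.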

Can anybody help me get a working solution out of it?
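
One possible way to handle many columns at once, sketched here under assumptions: it uses the integer-arithmetic idea from the quoted reply instead of strsplit(), it assumes every column other than SN holds a two-digit code, and the names `ct_cols`, `first_digit`, `second_digit`, `halves`, `ord` and `split_df` are made up for illustration. The data is shaped like the original example but is seeded differently, so the values will not match the quoted output.

```r
# Example data shaped like the original post (seeded once, so values differ).
set.seed(12341)
pool <- c(12, 13, 14, 23, 24, 34)
mydf <- data.frame(SN  = 1:100,
                   CT1 = sample(pool, 100, replace = TRUE),
                   CT2 = sample(pool, 100, replace = TRUE),
                   CT3 = sample(pool, 100, replace = TRUE))

# Assumption: every column except SN holds a two-digit code.
ct_cols <- setdiff(names(mydf), "SN")

# Integer division and remainder are vectorised over a whole column.
first_digit  <- lapply(mydf[ct_cols], function(x) x %/% 10)
second_digit <- lapply(mydf[ct_cols], function(x) x %% 10)
names(first_digit)  <- paste0(ct_cols, "a")
names(second_digit) <- paste0(ct_cols, "b")

# Interleave the halves so the columns read CT1a, CT1b, CT2a, CT2b, ...
halves <- c(first_digit, second_digit)
ord <- as.vector(rbind(seq_along(ct_cols), seq_along(ct_cols) + length(ct_cols)))
split_df <- data.frame(SN = mydf$SN, halves[ord])

head(split_df)
```

Because nothing is converted to character, this should scale to thousands of columns more comfortably than a row-wise apply(), though that has not been timed here.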

Ram H

On Sun, May 1, 2011 at 5:03 PM, Steve Lianoglou <mailinglist.honeypot_at_gmail.com> wrote:

> Hi,
>
> There are a couple of ways to do what you want.
>
> I'll provide the fodder and let you finish the implementation.
>
> On Sun, May 1, 2011 at 4:26 PM, Ram H. Sharma <sharma.ram.h_at_gmail.com>
> wrote:
> > Hi R experts
> >
> > I have a couple of quick questions:
> >
> > Q1
> > #my data
> > set.seed(12341)
> > SN <- 1:100
> > pool<- c(12,13,14, 23, 24, 34)
> > CT1<- sample(pool, 100, replace= TRUE)
> > set.seed(1242)
> > CT2 <- sample(pool, 100, replace= TRUE)
> > set.seed(142)
> > CT3 <- sample(pool, 100, replace= TRUE)
> > # the variables run up to column 20000
> > mydf <- data.frame(SN, CT1, CT2, CT3)
> >
> > First question: how can I split 12 into 1 2, 13 into 1 3, 14 into 1 4?
> > What I am trying here is to split each number into two and make separate
> > variables CT1a and CT1b, CT2a and CT2b, CT3a and CT3b.
> >
> > I tried strsplit(), but I believe it works with characters only.
>
> You can convert your numbers to characters, if you like. Using your
> dataset, consider:
>
> R> ct1.char <- as.character(mydf$CT1)
> R> ct1.char <- strsplit(as.character(mydf$CT1), '')
> R> ct1a <- sapply(ct1.char, '[', 1) ## "non-obvious" use of '[' as
> R> ct1b <- sapply(ct1.char, '[', 2) ## a function is intentional :-)
> R> head(data.frame(ct1a=ct1a, ct1b=ct1b))
>   ct1a ct1b
> 1    3    4
> 2    1    4
> 3    2    3
> 4    1    4
> 5    3    4
> 6    2    3
>
> > Q2
> > Is there any function that works in the same manner as the find and replace
> > function in MS Excel? Just for example, I want to replace all 1s in the
> > above data frame with "A" and all 2s with "B", so that the number 12 is
> > converted to "AB". I tried with car, but it is very slow, as I need to
> > apply it to a very large data frame.
>
> Try gsub:
>
> R> head(ct1a)
> [1] "3" "1" "2" "1" "3" "2"
>
> R> head(gsub("1", "A", ct1a))
> [1] "3" "A" "2" "A" "3" "2"
>
> or you can use a "translation table"
>
> R> xlate <- c('1'='A', '2'='B', '3'='C')
> R> head(xlate[ct1a])
>   3   1   2   1   3   2
> "C" "A" "B" "A" "C" "B"
>
> You might also consider not converting your original data into
> characters and splitting off the integers -- you can use modulo
> arithmetic to get each digit, ie:
>
> R> head(mydf$CT1)
> [1] 34 14 23 14 34 23
>
> ## First digit
> R> head(as.integer(mydf$CT1 / 10))
> [1] 3 1 2 1 3 2
>
> ## Second digit
> R> head(mydf$CT1 %% 10)
> [1] 4 4 3 4 4 3
>
> There's some food for thought ..
>
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
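
For the find-and-replace question (Q2) quoted above, base R's chartr() translates characters one for one in a single pass, which may be simpler than repeated gsub() calls. The following is a minimal sketch: the mapping of 3 to "C" and 4 to "D" extends the example in the question and is an assumption, and `codes`, `letters_out` and `df` are made-up names.

```r
codes <- c(12, 13, 14, 23, 24, 34)   # the pool from the original post

# chartr(old, new, x) swaps each character in `old` for the character at the
# same position in `new`.
letters_out <- chartr("1234", "ABCD", as.character(codes))
print(letters_out)   # "AB" "AC" "AD" "BC" "BD" "CD"

# The same translation applied to every column of a small made-up data frame.
df <- data.frame(CT1 = c(12, 34), CT2 = c(23, 14))
df[] <- lapply(df, function(x) chartr("1234", "ABCD", as.character(x)))
print(df)
```

Assigning to `df[]` keeps the data frame structure while replacing each column with its translated character version.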

-- 

Ram H


______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu 05 May 2011 - 06:25:08 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 05 May 2011 - 07:00:06 GMT.
