Re: [R] substr or split help needed

From: jim holtman <jholtman_at_gmail.com>
Date: Sun 18 Jun 2006 - 04:24:16 EST

I used your sample data. When reading it in, I used 'as.is=TRUE' to prevent the character columns from being converted to factors. I then did an explicit conversion to numeric on the PLAND column.

You should use 'str' to look at the structure of your data frame.
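
As a rough sketch of those two steps for anyone without the clipboard data: the `csv_text` string below is an assumption that stands in for the poster's file, and `suppressWarnings` is added only to keep the output quiet. It is not the exact session shown further down.

```r
# Stand-in for the poster's file (an assumption; the thread read the clipboard).
# Backslashes are doubled only because this is an R string literal.
csv_text <- paste(
  "LID,TYPE,PLAND",
  "D:\\Bijou-MC\\Simula_P005_H100_R001.txt , Forest , NA",
  "D:\\Bijou-MC\\Simula_P005_H100_R001.txt , Forest , 10.2",
  "D:\\Bijou-MC\\Simula_P010_H100_R001.txt , Forest , 9.2",
  sep = "\n"
)

# as.is = TRUE keeps character columns as character instead of factor
x <- read.csv(text = csv_text, as.is = TRUE)

# str() shows the type of every column; in the session below PLAND came in
# as character because of the padded " NA" entry
str(x)

# Explicit conversion; as.numeric() ignores the padding around the numbers
# and a padded " NA" string becomes NA (normally with a coercion warning)
x$PLAND <- suppressWarnings(as.numeric(x$PLAND))
str(x)
```

After the conversion, something like hist(x$PLAND) should no longer complain that the data are not numeric.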

You have backslashes and your 'print' will not show them. You do see them in the 'str' call.

Because the backslash is an escape character both in R strings and in regular expressions, you have to escape it twice (4 of them in the strsplit command). Binding the pieces together with rbind gives a character matrix that you can then extract the data from.
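
A small sketch of the double escaping, using one path from the sample data. The `fixed = TRUE` variant is an alternative that was not used in this thread; it is shown only for comparison.

```r
# One path as it would sit in the data frame: each "\\" below is a single
# backslash character; the doubling is only R's string-literal syntax.
lid <- "D:\\Bijou-MC\\Simula_P005_H100_R001.txt"

nchar("\\")       # 1: a single backslash character
cat(lid, "\n")    # cat() writes the raw characters

# Regex route: the pattern has to be an escaped backslash, which is typed
# as four backslashes inside an R string literal.
parts_regex <- strsplit(lid, "\\\\")[[1]]

# Alternative (not from the thread): fixed = TRUE switches off regex
# handling, so one literal backslash (typed as two) is enough.
parts_fixed <- strsplit(lid, "\\", fixed = TRUE)[[1]]

identical(parts_regex, parts_fixed)
```

Either call should split the path into the drive, the directory and the file name.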

> x <- read.csv('clipboard', as.is=TRUE)
> x

                                        LID     TYPE       PLAND
1  D:\\Bijou-MC\\Simula_P005_H100_R001.txt   Forest           NA
2  D:\\Bijou-MC\\Simula_P005_H100_R001.txt   Forest         10.2
3  D:\\Bijou-MC\\Simula_P010_H100_R001.txt   Forest          9.2

> str(x)

`data.frame': 3 obs. of 3 variables:
 $ LID : chr " D:\\Bijou-MC\\Simula_P005_H100_R001.txt " " D:\\Bijou-MC\\Simula_P005_H100_R001.txt " " D:\\Bijou-MC\\Simula_P010_H100_R001.txt "
 $ TYPE : chr " Forest " " Forest " " Forest "
 $ PLAND: chr " NA" " 10.2" " 9.2"
> x$PLAND <- as.numeric(x$PLAND)

Warning message:
NAs introduced by coercion
> str(x)

`data.frame': 3 obs. of 3 variables:
 $ LID : chr " D:\\Bijou-MC\\Simula_P005_H100_R001.txt " " D:\\Bijou-MC\\Simula_P005_H100_R001.txt " " D:\\Bijou-MC\\Simula_P010_H100_R001.txt "
 $ TYPE : chr " Forest " " Forest " " Forest "
 $ PLAND: num NA 10.2 9.2
> strsplit(x$LID, "\\\\")
[[1]]
[1] " D:"                        "Bijou-MC"
[3] "Simula_P005_H100_R001.txt "

[[2]]
[1] " D:"                        "Bijou-MC"
[3] "Simula_P005_H100_R001.txt "

[[3]]
[1] " D:"                        "Bijou-MC"
[3] "Simula_P010_H100_R001.txt "


> do.call('rbind', strsplit(x$LID, "\\\\"))
     [,1]  [,2]       [,3]
[1,] " D:" "Bijou-MC" "Simula_P005_H100_R001.txt "
[2,] " D:" "Bijou-MC" "Simula_P005_H100_R001.txt "
[3,] " D:" "Bijou-MC" "Simula_P010_H100_R001.txt "
>
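
One possible way to finish the job from that matrix, sketched under assumptions: the data frame is rebuilt by hand from the sample rows, the regular expression `pat` is a hypothetical pattern for the "Simula_Pnnn_Hnnn_Rnnn.txt" naming seen here, and storing P, H and R as numbers rather than strings is a guess at what is wanted.

```r
# Sample rows from this thread, rebuilt by hand so the sketch runs on its own
x <- data.frame(
  LID = c(" D:\\Bijou-MC\\Simula_P005_H100_R001.txt ",
          " D:\\Bijou-MC\\Simula_P005_H100_R001.txt ",
          " D:\\Bijou-MC\\Simula_P010_H100_R001.txt "),
  TYPE = c(" Forest ", " Forest ", " Forest "),
  PLAND = c(NA, 10.2, 9.2),
  stringsAsFactors = FALSE
)

# Split each path on the backslash and bind the pieces into a matrix
parts <- do.call("rbind", strsplit(x$LID, "\\\\"))

# Column 3 holds the file name; capture the digits after P, H and R.
# The trailing " *" allows for the padding left over from the input.
pat <- "^Simula_P([0-9]+)_H([0-9]+)_R([0-9]+)\\.txt *$"
fname <- parts[, 3]
x$P <- as.numeric(sub(pat, "\\1", fname))
x$H <- as.numeric(sub(pat, "\\2", fname))
x$R <- as.numeric(sub(pat, "\\3", fname))

# Drop the original path column
x$LID <- NULL
x
```

A pattern is used instead of fixed positions so that the padding and the drive/directory prefix do not matter. If positions are preferred, note that substr() takes a start and a stop position, not a start and a length.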

On 6/17/06, Milton Cezar <miltinho_astronauta@yahoo.com.br> wrote:
>
> Dear R-friends
>
> I have several data files with about 1,900 lines (records) each. I'm using
> the read.table command to read the files. The files look like
> LID , TYPE , PLAND
> D:\Bijou-MC\Simula_P005_H100_R001.txt , Forest , NA
> D:\Bijou-MC\Simula_P005_H100_R001.txt , Forest , 10.2
> D:\Bijou-MC\Simula_P010_H100_R001.txt , Forest , 9.2
> ---
>
> My first problem is that some commands (like hist(data$PLAND)) say that the
> data isn't numeric. Maybe because the first PLAND value is NA? When
> I ran the read.table command I used something like:
> data<-read.table (file="xxx.dat", head=T, sep="\,", na.strings="NA").
>
> Another problem is that I need to parse the LID column. When I do
> "print(head(data$LID))" I receive the following result (note that the
> backslashes were lost on the read):
> D:Bijou-MCSimula_P005_H100_R001.txt
> D:Bijou-MCSimula_P005_H100_R001.txt
> D:Bijou-MCSimula_P010_H100_R001.txt
> That's OK for me, but now I need to create the P, H and R columns in the "data"
> table by parsing the LID column. When I try to use the command
> "p<-substr(data$LID, 19,3)" I get an error message saying that the variable
> is not a character one.
>
> Finally, I'd like to drop the LID column and insert P, H and R into the
> table.
>
> Thanks for your help!
>
> Kind regards, miltinho
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?


Received on Sun Jun 18 04:31:11 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 18 Jun 2006 - 06:10:59 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.