[R] NA problem when use paste function

From: Lu, Jiang <lu.jjane_at_gmail.com>
Date: Wed, 16 Apr 2008 22:29:08 -0400

Dear R helpers,

I am working on a genetics project with two data sets, X and Y. Some IDs appear in both data sets, and others appear in only one of them. I merged them with merge(x, y, by="ID", all=TRUE). Data set Y contains a variable (a genotype) that is also in X, so after the merge these two variables were automatically renamed by appending .x and .y to the original variable name.

As you can see in the listing below, I would like to take whichever value is available (non-missing, non-NA) in X or Y as the final value of the genotype S3Allele1. I used the paste() function, but it converts <NA> to the character string "NA". Would you please tell me how I can get just the genotype, without the NA pasted onto it? I checked the documentation for paste() and noticed that it applies as.character() to its vector arguments. I guess that is why I got "NA" as a string in the new variable I created (S3Allele1). Should I use some other function to avoid this problem? Any insight is appreciated!

           ID      S3Allele1.x S3Allele1.y S3Allele1
1       10003           G        <NA>      G NA
2       10004           A        <NA>      A NA
3       10005           A        <NA>      A NA
4       10006           A        <NA>      A NA
5       10007           G        <NA>      G NA
6       10008           A        <NA>      A NA
7       10009           A        <NA>      A NA
8       10010           A        <NA>      A NA
9       10011           A        <NA>      A NA
10      10013           A        <NA>      A NA
11      10014           A        <NA>      A NA
12      10015           A        <NA>      A NA
13      10016           A        <NA>      A NA
14      10017           A        <NA>      A NA
15      10018           A        <NA>      A NA
16      10019           G        <NA>      G NA
17      10020           A        <NA>      A NA
18      10021           G        <NA>      G NA
19      10022           A        <NA>      A NA
20      10023           G        <NA>      G NA
21      10024           G        <NA>      G NA
22      10025           G        <NA>      G NA
23      10027           G        <NA>      G NA
24      10028           G        <NA>      G NA
25      10029           G        <NA>      G NA
26      10031           G        <NA>      G NA
27      10032           A        <NA>      A NA
28      10033        <NA>                   NA
29      10035           A        <NA>      A NA
30      10037           A        <NA>      A NA
31      10038        <NA>           A      NA A
32      10039        <NA>           A      NA A
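
For reference, the behaviour shown in the listing can be reproduced on a small example, together with one possible way to keep whichever value is non-missing. This is only a sketch under assumptions: the data frame `merged` and its three rows are made up for illustration, and the missing genotypes are assumed to be real NA values rather than empty strings (empty strings would need to be converted to NA first).

```r
# Hypothetical toy data standing in for the merged columns in the listing.
merged <- data.frame(
  ID = c(10003, 10033, 10038),
  S3Allele1.x = c("G", NA, NA),
  S3Allele1.y = c(NA, NA, "A"),
  stringsAsFactors = FALSE
)

# paste() coerces its arguments with as.character(),
# so each NA ends up as the literal string "NA".
pasted <- paste(merged$S3Allele1.x, merged$S3Allele1.y)

# Alternative: take the X value where it is present, otherwise the Y value.
# as.character() guards against the columns being factors after merge().
x <- as.character(merged$S3Allele1.x)
y <- as.character(merged$S3Allele1.y)
merged$S3Allele1 <- ifelse(is.na(x), y, x)
```

With this approach a row that is missing in both X and Y stays NA in S3Allele1 instead of becoming a string. It does not check whether X and Y disagree when both are present; in that case the X value is kept.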

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 17 Apr 2008 - 04:46:36 GMT

