Re: [R] .gct file

From: Marc Schwartz (via MN) <mschwartz_at_mn.rr.com>
Date: Wed 20 Jul 2005 - 04:08:15 EST

For the TAB delimited columns, adjust the 'sep' argument to:

read.table("data.gct", skip = 2, header = TRUE, sep = "\t")

The 'quote' argument is by default:

quote = "\"'"

which should take care of the quoted strings and bring them in as a single value.
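As a minimal sketch of the call above, the following writes a small sample file to a temporary path and reads it back. The file contents follow the layout shown in the quoted message below; the second data row (gene2) and the temporary path are made up for illustration only.

```r
# Hypothetical sample data: line 1 is the version, line 2 the matrix size,
# line 3 the TAB delimited header, then the data rows.
gct_lines <- c(
  "#1.2",
  "2\t1",
  "NAME\tDESCRIPTION\tVALUE",
  "gene1\t\"a protein inducer\"\t1123",
  "gene2\t\"a made-up description\"\t42"   # invented row, not from the original file
)
path <- tempfile(fileext = ".gct")
writeLines(gct_lines, path)

# Skip the two leading lines, take the header from line 3, split on TABs.
# The default 'quote' argument strips the double quotes from DESCRIPTION.
dat <- read.table(path, skip = 2, header = TRUE, sep = "\t")
str(dat)

unlink(path)
```

With sep = "\t" the spaces inside the description no longer act as separators, so each description stays in a single column.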

The above presumes that the header row is also TAB delimited. If not, you may have to set 'skip = 3' and 'header = FALSE' to skip over the header row, and then set the column names manually (for example with the 'col.names' argument).
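A rough sketch of that fallback, assuming a header row separated by spaces rather than TABs. The sample contents and the temporary path are again made up for illustration; the column names are the ones from the example in the quoted message.

```r
# Hypothetical sample data: the header on line 3 uses spaces, the data rows use TABs.
gct_lines <- c(
  "#1.2",
  "1\t1",
  "NAME DESCRIPTION VALUE",
  "gene1\t\"a protein inducer\"\t1123"
)
path <- tempfile(fileext = ".gct")
writeLines(gct_lines, path)

# Skip all three leading lines (including the header) and name the columns by hand.
dat <- read.table(path, skip = 3, header = FALSE, sep = "\t",
                  col.names = c("NAME", "DESCRIPTION", "VALUE"))
str(dat)

unlink(path)
```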

HTH, Marc Schwartz

On Tue, 2005-07-19 at 13:52 -0400, mark salsburg wrote:
> This is all extremely helpful.
>
> The data, it turns out, is a little atypical: the columns are tab-delimited
> except for the description column
>
>
> DATA1.gct looks like this
>
> #1.2
> 23 3423
> NAME DESCRIPTION VALUE
> gene1 "a protein inducer" 1123
> ..... ................. ......
>
> How do I get R to read the data as tab-delimited, but read in the 2nd
> column as one value based on the quotation marks?
>
> thanks..
>
> On 7/19/05, Marc Schwartz (via MN) <mschwartz@mn.rr.com> wrote:
> > On Tue, 2005-07-19 at 13:16 -0400, mark salsburg wrote:
> > > ok so the gct file looks like this:
> > >
> > > #1.2 (version number)
> > > 7283 19 (matrix size)
> > > Name Description Values
> > > .... ....... ......
> > >
> > > How can I tell R to disregard the first two lines and start reading
> > > at the 3rd line of this gct file? I would just delete them, but I do not
> > > know how to open a .gct file
> > >
> > > thank you
> > >
> > > On 7/19/05, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:
> > > > On 7/19/2005 12:10 PM, mark salsburg wrote:
> > > > > I have two files to compare, one is a regular txt file that I can read
> > > > > in no prob.
> > > > >
> > > > > The other is a .gct file (How do I read in this one?)
> > > > >
> > > > > I tried a simple
> > > > >
> > > > > read.table("data.gct", header = T)
> > > > >
> > > > > How do you suggest reading in this file??
> > > > >
> > > >
> > > > .gct is not a standard filename extension. You need to know what is in
> > > > that file. Where did you get it? What program created it?
> > > >
> > > > Chances are the easiest thing to do is to get the program that created
> > > > it to export in a well known format, e.g. .csv.
> > > >
> > > > Duncan Murdoch
> >
> >
> > The above would be consistent with the info in my reply.
> >
> > I guess if the format is consistent, as per Mark's example above, you
> > can use:
> >
> > read.table("data.gct", skip = 2, header = TRUE)
> >
> > which will start by skipping the first two lines and then reading in the
> > header row and then the data.
> >
> > See ?read.table
> >
> > HTH,
> >
> > Marc Schwartz
> >
> >
> >


