Re: [Rd] how to manipulate dput output format

From: andre zege <andre.zege_at_gmail.com>
Date: Mon, 25 Jun 2012 14:17:56 -0400

On Mon, Jun 25, 2012 at 1:08 PM, Simon Urbanek <simon.urbanek_at_r-project.org> wrote:

>
> On Jun 25, 2012, at 11:57 AM, andre zege wrote:
>
> >
> >
> > On Mon, Jun 25, 2012 at 11:17 AM, Simon Urbanek <
> simon.urbanek_at_r-project.org> wrote:
> >
> > On Jun 25, 2012, at 10:20 AM, andre zege wrote:
> >
> > > dput() is intended to be parsed by R so the above is not possible
> without massaging the output. But why in the world would you use dput() for
> something that you want to read in Java? Why don't you use a format that
> Java can read easily - such as JSON?
> > >
> > > Cheers,
> > > Simon
> > >
> > >
> > >
> > >
> > >
> > > Yeap, except i was just working with someone else's choice. Bigmatrix
> code uses dput() to dump desc file of filebacked matrices.
> >
> > Ah, ok, that is indeed rather annoying as it's pretty much the most
> non-portable storage (across programs) one could come up with. (I presume
> you're talking about big.matrix from bigmemory?)
> >
> >
> > > I got some time to do a little hack of reading big matrices nicely to
> java and was looking to some ways of smoothing the edges of parsing .desc
> file a little. I guess i am ok now with parsing .desc with some regex. One
> thing i am still wondering about is whether i really need to convert back
> and forth between little endian and big endian. Namely, java platform has
> little endian native byte order, and big matrix code writes stuff in big
> endian. It'd be nice if i could manipulate that by some #define somewhere
> in the makefile or something and make C++ write little endian without byte
> swapping every time i need to communicate with big matrix from java.
> >
> > I think you're wrong (if we are talking about bigmemory) - the
> endianness is governed by the platform as far as I can see. On
> little-endian machines the big matrix storage is little endian and on
> big-endian machines it is big-endian.
> >
> > It's very peculiar that the descriptor doesn't even store the endianness
> - I think you could talk to the authors and suggest that they include most
> basic information such as endianness and, possibly, change the format to
> something that is well-defined without having to evaluate it in R (which is
> highly dangerous and a serious security risk).
> >
> > Cheers,
> > Simon
> >
> >
> >
> > I would assume that hardware should dictate endianness, just like you
> said. However, the fact is that bigmemory writes in different endianness
> than java reads in. I simply compare matrices that i write using bigmemory
> and that I read into java. Unless i transform endianness, i get garbage,
> and if i swap byte order, i get the same matrix as the one i wrote. So, i
> don't think i am wrong about that, but i am curious about why it happens
> and whether it is possible to let bigmemory code write in natural
> endianness. Then i would not need to transform each double array element
> back and forth.
> >
>
> I think it has to do with the way you read it in Java since Java supports
> either endianness directly. What methods do you use exactly to read it? The
> on-disk storage is definitely native-endian so C/C++/... can simply mmap it
> with no swapping.
>
> Cheers,
> Simon
>
>
>

It's my first week doing Java, actually :). I simply did the following to read the binary file:

    public static double[] readVector(String fileName) throws IOException {
        FileChannel rChannel = new RandomAccessFile(new File(fileName), "r").getChannel();

        DoubleBuffer dBuf = rChannel.map(FileChannel.MapMode.READ_ONLY, 0, rChannel.size()).asDoubleBuffer();

        double[] vData = new double[(int) rChannel.size()/8];
        dBuf.get(vData);
        return vData;
    }

I just realized that the DoubleBuffer here is a view created from a ByteBuffer, and reading the Java 5 doc for ByteBuffer I see "The initial order of a byte buffer is always BIG_ENDIAN". So in fact I just need to check the ByteOrder and change it if it differs from the native one. So the correct code should look like this, it seems:

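    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.ByteOrder;
    import java.nio.DoubleBuffer;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.Arrays;

    public static double[] readVector(String fileName) throws IOException {
        // try-with-resources (Java 7+) closes the file and its channel on exit
        try (RandomAccessFile file = new RandomAccessFile(new File(fileName), "r");
             FileChannel rChannel = file.getChannel()) {

            MappedByteBuffer mbb = rChannel.map(FileChannel.MapMode.READ_ONLY, 0, rChannel.size());

            // A new byte buffer is always BIG_ENDIAN; switch to the platform's order if needed
            if (mbb.order() != ByteOrder.nativeOrder()) {
                mbb.order(ByteOrder.nativeOrder());
            }

            DoubleBuffer dBuf = mbb.asDoubleBuffer();
            double[] vData = new double[dBuf.remaining()];
            dBuf.get(vData);
            // Arrays.toString prints the values; println(vData) would print only the array reference
            System.out.println(Arrays.toString(vData));
            return vData;
        }
    }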
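
To convince myself, here is a small self-contained check, only a sketch and not anything taken from bigmemory. The class name ByteOrderDemo, the helpers writeDoubles and readDoubles, and the temp file are all made up for illustration. It writes a few doubles in little-endian order (standing in for what a little-endian machine would put on disk), then maps the file back twice: once decoding with the same order and once with the BIG_ENDIAN default.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.util.Arrays;

class ByteOrderDemo {

    // Write the values as raw 8-byte doubles using the given byte order.
    public static void writeDoubles(File f, double[] values, ByteOrder order) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(values.length * 8).order(order);
        for (double v : values) {
            buf.putDouble(v);
        }
        buf.flip();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    // Map the file and decode it as doubles, interpreting the bytes in the given order.
    public static double[] readDoubles(File f, ByteOrder order) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r");
             FileChannel ch = raf.getChannel()) {
            ByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.order(order); // without this call the buffer stays BIG_ENDIAN
            double[] out = new double[buf.remaining() / 8];
            buf.asDoubleBuffer().get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("vector", ".bin"); // hypothetical scratch file
        f.deleteOnExit();
        double[] values = {1.0, 2.5, -3.75};

        // Assumption: the writer ran on a little-endian machine.
        writeDoubles(f, values, ByteOrder.LITTLE_ENDIAN);

        // Matching order recovers the original values.
        System.out.println(Arrays.toString(readDoubles(f, ByteOrder.LITTLE_ENDIAN)));
        // Mismatched order decodes the same bytes into unrelated numbers.
        System.out.println(Arrays.toString(readDoubles(f, ByteOrder.BIG_ENDIAN)));
    }
}
```

The first line printed should match the values written, while the second shows the kind of garbage I was seeing. When reading a file that another program wrote on the same machine, ByteOrder.nativeOrder() would be the order to pass to readDoubles.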

Sorry for the confusion and thanks for the lesson, Simon :)




R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon 25 Jun 2012 - 18:23:26 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 25 Jun 2012 - 18:30:30 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.
