[R] OT:Selling Data Mining Cloud-A Concept

From: Ajay ohri <ohri2007_at_gmail.com>
Date: Thu, 31 Jul 2008 15:50:50 +0530


The Ohri Framework - Data Mining on Demand
<http://decisionstats.com/2008/the-ohri-framework-data-mining-on-demand/>

Part of the reason SAS continues to enjoy a profitable lead is:

1) Standardized language elements (Data steps and Procs)

2) Ease of learning SAS

3) Output delivery to multiple destinations

4) Input from multiple data sources

But the most important reason is the sheer efficiency of the PDV (Program Data Vector) in reading large files. If Excel could load a 300 MB file that easily, it would make a significant dent. Large files are assumed to be used by larger license holders.

Cloud computing could help languages like R here. R is very good at advanced statistics, is free, and its packages are peer reviewed. It also has little-known but very good GUIs (like rattle). If you place the rattle GUI in a cloud, it would use processing power on demand and output the results. SAS won't do this because they charge by CPU count, and that's an idle asset (reaping rewards from programming done long ago).

Thus you save on hardware and software costs. People pay only when they use the system. But an additional cost is the fixed cost of the remote application built to support the framework, including transport bandwidth cost. A suggestion could be to use compressed and encrypted data transfers to and from the remote cloud. PGP.com would be of help here.
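To make the compressed-transfer idea concrete, here is a minimal sketch in Python using only the standard gzip module. The function names and the sample data are made up for illustration. Encryption is deliberately left out: it would be applied to the compressed bytes (compressing first, since encrypted data compresses poorly) with a tool such as PGP, which is not shown here.

```python
import gzip


def pack_for_upload(csv_text: str) -> bytes:
    """Gzip-compress CSV text before sending it to the remote cloud."""
    return gzip.compress(csv_text.encode("utf-8"))


def unpack_after_download(payload: bytes) -> str:
    """Reverse pack_for_upload on the receiving side."""
    return gzip.decompress(payload).decode("utf-8")


# Hypothetical sample data: a repetitive CSV, as transactional data often is.
rows = ["id,segment,spend"] + [f"{i},retail,{i % 7 * 10}" for i in range(10000)]
csv_text = "\n".join(rows)

payload = pack_for_upload(csv_text)
raw_size = len(csv_text.encode("utf-8"))

print(f"raw bytes: {raw_size}, compressed bytes: {len(payload)}")
```

How much bandwidth this saves depends entirely on the data; repetitive tabular data tends to compress well, already-compressed files do not.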

Users pay for bandwidth, plus cost and a small markup for the cloud hosting. Economies of scale will ensue.
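One possible reading of that pricing rule, sketched in Python: bandwidth is passed through at cost, and hosting is billed at cost plus a markup. The function name, the parameters and every rate below are hypothetical placeholders, not real prices.

```python
def session_charge(gb_transferred: float,
                   cpu_hours: float,
                   bandwidth_rate: float,
                   hosting_rate: float,
                   markup: float) -> float:
    """Charge for one session: bandwidth at cost, hosting at cost plus markup."""
    bandwidth_cost = gb_transferred * bandwidth_rate
    hosting_cost = cpu_hours * hosting_rate
    return bandwidth_cost + hosting_cost * (1.0 + markup)


# Hypothetical example: a 300 MB upload and two CPU-hours of modelling.
charge = session_charge(
    gb_transferred=0.3,    # 300 MB
    cpu_hours=2.0,
    bandwidth_rate=0.10,   # assumed price per GB
    hosting_rate=0.50,     # assumed price per CPU-hour
    markup=0.15,           # assumed 15% markup on hosting
)
print(f"session charge: {charge:.2f}")
```

The fixed cost of building the remote application is not in this formula; it would have to be recovered through the markup across many users.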

R's graphical system is superior to that of SAS, and it could be adapted to newer graphics platforms like Silverlight.

The little guy no longer needs to stretch himself to afford big computing power.

I could be totally wrong here, but it may be worth a shot.


R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 31 Jul 2008 - 10:27:09 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 31 Jul 2008 - 10:33:01 GMT.
