Data Processing Performance Options

Posted by Lawrence Sinclair on 14 Sep 2009 at 04:19

Here are a few of my thoughts on technologies and approaches for achieving better data processing performance in the current technology landscape.

RDBMS
Using MySQL or another RDBMS, performance might be addressed with better indexing or by partitioning the data.
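
As a rough illustration of the indexing point, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so that it runs without a server. The table name `events`, its columns, and the index name `idx_events_user` are made up for the example. It compares the query plan the engine reports before and after an index is added to the filtered column.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " ".join(row[3] for row in rows)


# Hypothetical table, held in memory (SQLite standing in for MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO events (user_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

sql = "SELECT SUM(amount) FROM events WHERE user_id = 42"

# Without an index the engine has to scan the whole table.
plan_before = query_plan(conn, sql)

# Index the column used in the WHERE clause, then ask again.
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
plan_after = query_plan(conn, sql)

total = conn.execute(sql).fetchone()[0]

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, and MySQL's own EXPLAIN output looks different again, but the idea carries over: after the index exists, the planner can look up matching rows instead of reading every row.
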
MAP-REDUCE NON-RELATIONAL SYSTEMS
A non-relational approach might be to use Hadoop or one of its distributions (such as Cloudera). This would allow processing to be distributed across anywhere from 3 local machines to a virtually unlimited number (hundreds or more) of machines in the cloud (such as Amazon EC2). But this is best suited for analytic and data processing tasks that can take several minutes or hours.
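
To make the map-reduce idea concrete, here is a small single-process sketch of the programming model in plain Python, using a word count as the example job. It is not Hadoop code and does not use Hadoop's API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are my own, and a real cluster would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


counts = run_job(["big data big cluster", "data on the cluster"])
print(counts)
```

Because each call to `map_phase` looks at only one record and each call to `reduce_phase` looks at only one key, the work can be split across machines without the pieces needing to talk to each other, which is what makes this style of processing easy to distribute.
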
THE BEST OF BOTH WORLDS?!
Somewhere in between these two systems is HadoopDB by Daniel Abadi of Yale. It uses the Hadoop...

Amazon Web Services Console

Posted by Lawrence Sinclair on 10 Jan 2009 at 03:19

Amazon's management console for EC2 now makes it easier than ever to enter the compute cloud. Selecting a machine image, then launching and monitoring an instance of a virtual machine, is now an intuitive visual experience.