“Stampede” is a new 10-petaflop supercomputer being built at the University of Texas at Austin. It is scheduled for deployment in 2013. Check out the Texas Advanced Computing Center article.
The company may be only a few months old, but Platfora has just raised $5.7M to build out a data management platform for open source Hadoop. Its aim is to build tools that make working with big data easier for analysts. Check out the VentureBeat article and the TechCrunch article.
MapR announced a $20 million second round of funding today. Its aim is to bring Hadoop to the enterprise, and it will use the new funds to scale its operations. Here’s the VentureBeat article.
Check out this VentureBeat article about Facebook open sourcing the hardware in their data center through the Open Compute Project.
Strata describes the process Facebook recently went through to move 30 petabytes of Hadoop data from one data center to another.
Adopting Hadoop as a tool within your enterprise can be quite an endeavor: you have to figure out which software components you need, how to configure them, and what hardware they should run on. Lots of people are running different configurations, and while the community does share a lot of information, there aren’t many good recaps of the hardware being used. Monash Research has a good writeup that also compares how Hadoop hardware has changed over the past couple of years.
CMS Wire reviews Pentaho’s newest release of its open source BI software.
Check out Mallet, a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

