Sep 05

The National Security Agency (NSA) has just made a new submission to the Apache Software Foundation.  It’s called Accumulo, and it is a key/value data store based on the BigTable paper.  It runs on top of Hadoop, ZooKeeper, and Thrift.

Sep 03

IBM Research has developed a hardware and software solution to join 200,000 hard disks together into a single 120 petabyte storage cluster.  Here’s an article from ExtremeTech and here’s one from O’Reilly Radar with more details.  As of last year, Facebook had the world’s largest Hadoop cluster at 21 petabytes.  This IBM cluster is for a customer, likely a government agency.

Aug 30

MapR announced a $20 million second round of funding today. Their aim is to bring Hadoop to the enterprise, and they will use the new funds to scale their operations. Here’s the VentureBeat article.

Jul 28

Strata describes the process Facebook recently went through to move 30 petabytes of Hadoop data from one data center to another.

Jun 05

If you’re interested in using Hadoop as a tool within your enterprise, it can be quite an endeavor: figuring out what software components you need, what configuration you need, and what hardware it should run on. Lots of people are running different configurations, and while the community does share a lot of information, there aren’t many good recaps of the hardware being used. Monash Research has a good writeup that also compares how Hadoop hardware has changed over the past couple of years.

Mar 24

Cloudera has a post on their blog by one of the engineers from Rapleaf about transitioning their infrastructure from MySQL to Hadoop and the benefits it offered for scaling.

Mar 19

Scobleizer brings us this interview with Jack Levin talking about using NVIDIA GPU cards in their Hadoop environment and getting a 30x improvement in search performance.



Jan 29

Here’s a post from the Netflix Tech blog talking about their usage of NoSQL technologies including Amazon SimpleDB, Hadoop, HBase, and Cassandra. Of special interest was the mention of DataStax, a company offering commercial support for Cassandra.

Jan 23

Check out BackType and their analytical capabilities for social media. Their platform is running on Hadoop and they are using Cascading. Here’s an O’Reilly Radar review.



Dec 22

Here’s a presentation from the QCon conference about the architecture used by Quantcast to process hundreds of terabytes of data daily using Hadoop.
