Posts Tagged ‘hadoop’

Examples of how the enterprise uses Hadoop

12th October 2009 by No Comments

Monash Research lists how some Cloudera customers are using Hadoop.

Is Cloudera the new Red Hat?

5th October 2009 by No Comments

As the open source software movement continues to strengthen, questions abound about where the opportunities to create commercially viable solutions lie. Red Hat did it with Linux. Can Cloudera do it with Hadoop? Read this GigaOm article.

Executive overview of Hadoop from Cloudera

9th July 2009 by No Comments

The guys from Cloudera put together the following executive overview of what Hadoop can do for big data.

Hadoop and Big Data 1: Challenging Old Assumptions from Cloudera on Vimeo.

MapReduce faces criticism

20th May 2009 by No Comments

MapReduce has been facing criticism following some recent performance tests. Don’t worry. The takeaway is basically that MapReduce should not compete with a DBMS in areas where the DBMS is already good. The article suggests MapReduce should be used to solve the following problem types:

text tokenization, indexing, and search
Creation of other kinds of data [...]
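
To make the first problem type concrete, here is a minimal sketch of tokenization and indexing written in the map/reduce style. It runs in a single Python process rather than on Hadoop, and the function names (map_tokens, reduce_postings, build_index) and the sample documents are made up for illustration; they are not taken from the article or from any Hadoop API.

```python
import re
from collections import defaultdict


def map_tokens(doc_id, text):
    """Map step: emit a (token, doc_id) pair for every word in the text."""
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        yield token, doc_id


def reduce_postings(token, doc_ids):
    """Reduce step: collapse all doc ids for one token into a sorted posting list."""
    return token, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle (group by key), and reduce in one process."""
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for token, value in map_tokens(doc_id, text):
            grouped[token].append(value)  # the "shuffle": group values by key
    return dict(reduce_postings(t, ids) for t, ids in grouped.items())


# Hypothetical sample input, only for demonstration.
docs = {1: "Hadoop runs MapReduce jobs", 2: "Pig compiles to MapReduce"}
index = build_index(docs)
```

On a real cluster the framework would handle the grouping step and spread the map and reduce calls across machines; the sketch only shows the shape of the computation.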

10 MapReduce tips

20th May 2009 by No Comments

Cloudera put together this list of 10 MapReduce tips.
You might also want to check out their list of 5 common Hadoop questions.

Web analytics databases keep getting bigger

30th April 2009 by No Comments

Who has the biggest database? Due to the increasing amount of behavioral information tracked during a web browsing session, some internet properties are starting to rack up some pretty hefty databases.
eBay has a 6.5 petabyte Greenplum warehouse and a 2.5 petabyte Teradata warehouse. This system ingests hundreds of billions of new rows of [...]

Cloudera and Yahoo partner for Apache Pig training

25th April 2009 by No Comments

Cloudera’s online Hadoop training videos now include two sessions for Apache Pig thanks to some help from Alan Gates at Yahoo.
Introduction to Pig
Pig Tutorial

Hadoop project announces Pig 0.20 release

15th April 2009 by No Comments

Pig is an open source platform for analyzing large data sets that works in conjunction with Hadoop clusters and MapReduce jobs. The project recently announced its 0.20 release, featuring a 5X performance gain over the previous version. Check out the details.

Hadoop and Hive slides

8th April 2009 by No Comments

Here is a recent Hadoop and Hive presentation by Joydeep Sen Sarma of Facebook, delivered at IIT Delhi.

Amazon now offers Elastic MapReduce

2nd April 2009 by No Comments

Amazon has announced the public beta of their hosted Hadoop framework. Using Elastic MapReduce, you can quickly launch as much processing power as needed for your analytics task. Data can be stored on the S3 platform. Sign in to the AWS Management Console to kick things off.