Apr 30

Who has the biggest database? As sites track more and more behavioral information during each web browsing session, some internet properties are starting to rack up pretty hefty databases.

eBay has a 6.5-petabyte Greenplum warehouse and a 2.5-petabyte Teradata warehouse. This system ingests hundreds of billions of new rows of data every day.
Facebook has a 2.5-petabyte Hadoop system.
Yahoo has more than 1 petabyte running on their homemade system.

Apr 25

Cloudera’s online Hadoop training videos now include two sessions on Apache Pig, thanks to some help from Alan Gates at Yahoo.

Introduction to Pig
Pig Tutorial
