Oct 06

Connected Action has a cool set of visuals made with the network graph tool NodeXL. They took the Twitter user information for attendees of the recent WikiSym 2011 wiki conference and graphed their interconnections.

Here’s a sample from NodeXL:

NodeXL network graph

Oct 04

Oracle has just announced its Big Data Appliance in this press release. There’s some NoSQL stuff in there related to Oracle’s ownership of Berkeley DB, some Hadoop stuff, and some R stuff too.

Check out the post on the O’Reilly radar blog for more info.

Oct 01

A few former Google employees have launched a new startup called Zillabyte to help users analyze lots of data. Their idea was to give other people access to the type of tools they had as Google employees. You’ll have to sign up on their site to get an invite if you want to see what they’re up to. One of the co-founders, Peter Harrington, is the author of Machine Learning in Action.

Here’s an article from TechCrunch.

Sep 29

Google Analytics, the free web analytics system from Google, has gotten a few updates. It now has a real-time dashboard so you can see what is happening on your site right now. And for Google Analytics users who need more than what’s been available for free, Google now offers a premium tier with phone support, SLAs, and higher data limits. Read the TechCrunch article or the VentureBeat article.

Sep 24

The team at Revolution Analytics has put together a set of instructions for using data from a Google Docs spreadsheet in R. Since Google Docs is a good way to share data sets, and R is a great open source tool, this makes for an all-around good solution.

Sep 23

Zapaday is building a tool that scans over 4,000 sources for mentions of future events and then organizes what it finds into a calendar format. Check out their video.

And if that’s of interest to you, you will also want to check out Recorded Future.

Sep 23

A recent Business Week article, “Getting a Handle on Big Data with Hadoop”, highlights a number of companies (Walmart, Amazon, Walt Disney, General Electric, Nokia, and Bank of America) using Hadoop to solve some large-scale problems. Cloudera has broken down the problems into two main areas:

  1. data processing – Hadoop’s original use case. Log file processing, etc.
  2. advanced analytics – social network analysis, target marketing, content optimization, etc.
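
As a rough illustration of the first category, here is a tiny map/reduce-style count of status codes over log lines in plain Python. It does not use Hadoop itself, and the log format and sample lines are made up for the example.

```python
from collections import Counter
from itertools import chain

# Hypothetical log lines in a made-up "method path status" format.
SAMPLE_LOG = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
    "GET /index.html 200",
]


def map_line(line):
    """Map step: emit a (status_code, 1) pair for one log line."""
    status = line.split()[-1]
    return [(status, 1)]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


# Run the map step over every line, then reduce all emitted pairs.
mapped = chain.from_iterable(map_line(line) for line in SAMPLE_LOG)
status_counts = reduce_counts(mapped)
print(status_counts)
```

On a real cluster the map and reduce steps would run in parallel across many machines. The sketch only shows the shape of the computation.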

Sep 22

DataStax has just announced an $11M investment from Crosslink Capital and Lightspeed Ventures. DataStax makes products for use with the NoSQL database Cassandra. If you’re wondering what NoSQL is, check out NoSQL-Database.org. Check out the VentureBeat article for more about the investment.

Sep 22

Square, the company that offers a hardware plugin for collecting credit card payments from your iPhone or iPad, has just released a new tool for displaying time series. They call it Cube.

Check out FlowingData for more info.

Sep 21

For small and medium businesses that can’t afford their own infrastructure, Amazon EC2 is very appealing.  In a matter of minutes you can start up as few or as many instances as are required for your job and then turn them off a few hours later when your job is complete.  But what if you have a really big job?

Ars Technica has a story about a 30,000-core cluster on Amazon EC2 built by a company called Cycle Computing for a Big Pharma customer. It ran for 7 hours at a cost of $1,279 per hour.

It may be that even bigger clusters have been run on Amazon EC2…
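
To put the reported figures in perspective (30,000 cores, 7 hours, $1,279 per hour), here is a quick back-of-the-envelope calculation in Python. The inputs come from the story. The total and per-core-hour numbers are just derived arithmetic, not figures quoted by Ars Technica or Cycle Computing.

```python
# Figures as reported in the story.
CORES = 30_000
HOURS = 7
COST_PER_HOUR_USD = 1_279


def cluster_cost(cores, hours, cost_per_hour):
    """Return total cost, core-hours used, and cost per core-hour."""
    total = hours * cost_per_hour
    core_hours = cores * hours
    return {
        "total_usd": total,
        "core_hours": core_hours,
        "usd_per_core_hour": total / core_hours,
    }


summary = cluster_cost(CORES, HOURS, COST_PER_HOUR_USD)
print(f"Total cost:         ${summary['total_usd']:,}")
print(f"Core-hours used:    {summary['core_hours']:,}")
print(f"Cost per core-hour: ${summary['usd_per_core_hour']:.4f}")
```

That works out to a total of $8,953 for the run, or a little over four cents per core-hour.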
