Feb 05

Check out Mallet, a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

Feb 04

Feb 03

Here’s the YouTube playlist of the Strata Conference keynote speeches, 45 videos in all.

And here’s the link to many of the slide decks.

And here’s a good recap of the StrataConf event.

Feb 03

Kaggle has hosted several data mining competitions similar to the Netflix Prize, and it recently announced a new and much bigger one. It’s called the Heritage Health Prize, and the prize has been set at $3M. The goal is to predict when a person will need to go to the hospital before they actually make a visit. Here’s some more info from O’Reilly Radar. And here is Anthony Goldbloom of Kaggle announcing the contest at the Strata Conference…


Jan 29

Darren Vengroff, chief scientist at RichRelevance, explains how he is working to make recommendation systems smarter. Check out the Fast Company article.

Sep 19

Check out the 2010 INFORMS Data Mining Contest. Participants are challenged to predict stock prices at five-minute intervals. Visit the site to download the training data set. The submission deadline is October 10th, 2010.

Sep 04

A question posed recently on Quora, “How do I become a data scientist?”, has received tons of interesting and helpful feedback, including some recommended steps. Also check out coverage of the topic over at O’Reilly Radar.

Sep 04

The search results we see almost everywhere are based on a systematic estimate of each page’s relevance. What Facebook’s recent patent spells out is the ability to determine and display search results based on what a person’s friends, or friends of their friends, found relevant. Very interesting. Check out the Inside Facebook article for more, including what the impact on competing offerings like Google Me could be.
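
To make the idea concrete, here is a minimal sketch of friend-based reranking. It is not the method from the patent, just one simple way to do it: each result gets a point for every friend who found it relevant and a smaller amount for every friend of a friend. The function name `social_rank`, the `fof_weight` parameter, and the data layout are all assumptions made for illustration.

```python
def social_rank(results, friends, clicks, user, fof_weight=0.5):
    """Reorder results by how many people in the user's network found them relevant.

    results: list of page ids in their original order
    friends: dict mapping a person to the set of their friends (hypothetical layout)
    clicks:  dict mapping a person to the set of pages they found relevant
    """
    direct = friends.get(user, set())

    # Friends of friends, excluding the user and their direct friends.
    indirect = set()
    for friend in direct:
        indirect |= friends.get(friend, set())
    indirect -= direct | {user}

    # Direct friends count fully; friends of friends count for less.
    scores = {}
    weighted = [(p, 1.0) for p in direct] + [(p, fof_weight) for p in indirect]
    for person, weight in weighted:
        for page in clicks.get(person, set()):
            scores[page] = scores.get(page, 0.0) + weight

    # sorted() is stable, so pages with equal scores keep their original order.
    return sorted(results, key=lambda page: -scores.get(page, 0.0))


friends = {"ann": {"bob"}, "bob": {"ann", "cat"}, "cat": {"bob"}}
clicks = {"bob": {"page_b"}, "cat": {"page_c"}}
print(social_rank(["page_a", "page_c", "page_b"], friends, clicks, "ann"))
```

A real system would presumably blend a score like this with ordinary relevance rather than replace it.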

Jul 31

SETI (Search for Extra-Terrestrial Intelligence) is fairly well known for its use of distributed computing. For years, it has allowed home computer users to donate time and computing resources to analyze radio signals in the hope of identifying signs of extra-terrestrials. This article from O’Reilly Radar details some of the changes SETI has made.

Jul 31

Amazon has been recommending products based on past purchase and browsing data for some time. More recently, Facebook has been suggesting users you may want to add as friends. Now Twitter is adding a “Suggestions for You” feature based on the people you follow. A few more details from TechCrunch.
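
For a rough feel of how suggestions can be derived from the people you follow, here is a small sketch. It is not Twitter’s actual algorithm, only a common baseline: count how many of the accounts you follow also follow a given candidate, and suggest the candidates with the highest counts. The `suggest_follows` name and the `follows` dictionary are hypothetical.

```python
from collections import Counter


def suggest_follows(follows, user, limit=3):
    """Suggest accounts that are popular among the accounts the user already follows.

    follows: dict mapping an account to the set of accounts it follows (hypothetical layout)
    """
    mine = follows.get(user, set())
    counts = Counter()
    for followed in mine:
        for candidate in follows.get(followed, set()):
            # Skip the user and anyone they already follow.
            if candidate != user and candidate not in mine:
                counts[candidate] += 1

    # Highest count first; ties are broken alphabetically so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


follows = {
    "ann": {"bob", "cat"},
    "bob": {"dan", "eve"},
    "cat": {"dan", "ann"},
}
print(suggest_follows(follows, "ann"))
```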
