Finding Value in Social Media Data
Everyone is looking for new techniques that will help them leverage Big Data. One of FICO’s scientists, Xing Zhao, recently married a popular collaborative filtering algorithm, matrix factorization, with our powerful segmented ensemble models in FICO Model Builder. The result was strong enough to earn him a top-five finish in the 2012 Knowledge Discovery and Data Mining competition (KDD Cup).
This year’s competition attracted over 900 teams of high-caliber scientists, academics and students, as well as renowned industry experts. Placing in the top five is admirable. What’s even more impressive is that Xing did it as a solo effort, in his own spare time, while competing against highly focused academic teams.
KDD 2012 is a top forum for presenting and discussing academic and industry advancements in data mining, data science, Big Data and predictive analytics. This year’s KDD Cup competition challenged participants to create a recommendation system to predict whether a user would follow another user in Tencent Weibo, a Chinese micro-blogging site. Tencent provided the largest and most complex data set ever made available in a KDD Cup contest.
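For readers who have not seen matrix factorization applied to a task like this one, here is a minimal sketch in Python. It is not Xing's competition code, and it does not use FICO Model Builder. The function names, parameters and toy data are all hypothetical, chosen only for illustration. The idea is to learn a small vector of latent factors for each user and each followable account, so that the dot product of two vectors, passed through a sigmoid, estimates the probability of a follow.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(P, Q, u, i):
    """Estimated probability that user u follows item i."""
    return sigmoid(sum(pu * qi for pu, qi in zip(P[u], Q[i])))


def train_mf(observations, n_users, n_items, n_factors=4,
             lr=0.1, reg=0.01, epochs=300, seed=0):
    """Logistic matrix factorization trained with plain SGD.

    observations: list of (user, item, label) with label 1 (followed)
    or 0 (recommendation rejected). All names here are illustrative.
    """
    rng = random.Random(seed)
    # Small random starting values for the user (P) and item (Q) factors.
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    data = list(observations)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, y in data:
            err = y - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on log loss with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Toy data (made up): users 0 and 1 follow item 0 and reject item 1;
# users 2 and 3 do the opposite.
toy = [(0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0),
       (2, 0, 0), (2, 1, 1), (3, 0, 0), (3, 1, 1)]
P, Q = train_mf(toy, n_users=4, n_items=2)
print(round(predict(P, Q, 0, 0), 3), round(predict(P, Q, 0, 1), 3))
```

This post does not describe how the factorization outputs were combined with the segmented ensemble models, so that step is left out of the sketch.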
“The Segmented Ensemble module in Model Builder was vital to my solution, as there were many interactions between the variables and it was important to capture those interactions to make the model more predictive,” said Xing Zhao. He added, “This challenge was a great way to show that Model Builder works well with very large data sets and that scorecard ensembles can achieve great results quickly.”
Congratulations to Xing for tackling such a big problem and showing that scorecards can be great for more than just credit risk.