Kaggle Using Competitions To Raise Benchmarks And Best Practices
November 2, 2016
Kaggle, which was launched in 2010, is a Silicon Valley start-up that has used predictive modelling competitions to solve problems for organisations such as NASA, Wikipedia, Ford and Deloitte.

The site started out with the aim of becoming a platform for predictive modelling and analytics competitions. Companies and researchers post their data on Kaggle, and data miners and statisticians from around the world use it to build models.

Best known for the Heritage Health Prize competition, worth $3,000,000, Kaggle has run more than 200 data science competitions since its inception. It also conducts recruitment competitions, in which data scientists compete to be interviewed by leading organisations in data science and big data such as Facebook, Winton Capital and Walmart.

The site caters largely to data scientists, but also to anyone looking to host a competition, write their own code and get feedback from other experts, or explore and analyse high-quality public datasets.

Interested parties may also try their hand at the many competitions available at any given time to see how they rank against the world’s best.

Kaggle’s registered users, known as Kagglers, numbered more than 530,000 as of May 2016 and form the largest and most diverse data community in the world, spanning 194 countries. The site sees more than 4,000 posts a month and approximately 3,500 competition submissions per day.

Many successful projects have come out of the competitions, including work that advanced the state of the art in HIV research, chess ratings and traffic forecasting.

Competitions are run once hosts have prepared the data and a description of the problem. Kaggle then offers its consulting service to help the host frame the competition, anonymize the data and integrate the model that best suits them into their operations.

Participants will then run their own experiments utilising different techniques to compete against each other to produce the best models. The finished product is shared publicly through Kaggle Scripts in the hope it helps to set a higher benchmark and inspire future models.

Submissions can be made either through Scripts or by private manual upload. Most competition submissions are scored immediately (based on their predictive accuracy relative to a hidden solution file) and summarized on a live leaderboard.
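To make the scoring step concrete, here is a minimal sketch in Python of how a submission might be compared with a hidden solution file and ranked on a leaderboard. It is an illustration only, not Kaggle's implementation: the names accuracy, Leaderboard, submit and standings are hypothetical, plain classification accuracy is assumed as the metric (real competitions each define their own), and keeping only each team's best score is likewise an assumption.

```python
from typing import Dict, List, Tuple


def accuracy(predictions: Dict[str, int], solution: Dict[str, int]) -> float:
    """Fraction of solution rows predicted correctly.

    Rows missing from the submission count as wrong (an assumption).
    """
    if not solution:
        raise ValueError("solution file is empty")
    correct = sum(
        1 for row_id, truth in solution.items()
        if predictions.get(row_id) == truth
    )
    return correct / len(solution)


class Leaderboard:
    """Hypothetical leaderboard that keeps each team's best score."""

    def __init__(self, solution: Dict[str, int]) -> None:
        # The solution stays private; participants only ever see scores.
        self._solution = dict(solution)
        self._best: Dict[str, float] = {}

    def submit(self, team: str, predictions: Dict[str, int]) -> float:
        """Score a submission immediately and update the team's best."""
        score = accuracy(predictions, self._solution)
        self._best[team] = max(score, self._best.get(team, 0.0))
        return score

    def standings(self) -> List[Tuple[str, float]]:
        """Teams ordered by best score (highest first), ties by name."""
        return sorted(self._best.items(), key=lambda item: (-item[1], item[0]))


# Toy data: four rows with binary labels.
hidden_solution = {"r1": 1, "r2": 0, "r3": 1, "r4": 1}
board = Leaderboard(hidden_solution)

board.submit("team_a", {"r1": 1, "r2": 1, "r3": 1, "r4": 0})  # 2 of 4 correct
board.submit("team_b", {"r1": 1, "r2": 0, "r3": 1, "r4": 0})  # 3 of 4 correct
board.submit("team_a", {"r1": 1, "r2": 0, "r3": 1, "r4": 1})  # 4 of 4 correct

for rank, (team, score) in enumerate(board.standings(), start=1):
    print(rank, team, score)
```

In this toy run team_a's second submission lifts it above team_b, which is the kind of movement a live leaderboard makes visible.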

These competitions have shown a lot of promise in shifting paradigms and producing innovations, as well as bringing talented innovators to light. Several academic papers have also highlighted findings from the models produced.

The live leaderboard encourages participants to keep innovating and improving on current best practices. Kaggle's blog, No Free Hunch, writes up the winning innovations.