Kaggle


Kaggle, a subsidiary of Google LLC, is an online community of data scientists and machine learning practitioners. Kaggle allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.
Kaggle got its start in 2010 by offering machine learning competitions and now also offers a public data platform, a cloud-based workbench for data science, and artificial intelligence education. Its key personnel were Anthony Goldbloom and Jeremy Howard. Nicholas Gruen was the founding chair and was succeeded by Max Levchin. Equity was raised in 2011, valuing the company at $25 million. On 8 March 2017, Google announced that it was acquiring Kaggle.

Kaggle community

In June 2017, Kaggle announced that it had passed 1 million registered users, known as Kagglers. The community spans 194 countries. It is the largest and most diverse data community in the world, ranging from those just starting out to many of the world's best-known researchers.
Kaggle competitions regularly attract over a thousand teams and individuals. Many of these researchers publish papers in peer-reviewed journals based on their performance in Kaggle competitions.
By March 2017, the Two Sigma Investments fund was running a competition on Kaggle to code a trading algorithm.

Kaggle's services

A Kaggle competition typically proceeds as follows:
  1. The competition host prepares the data and a description of the problem.
  2. Participants experiment with different techniques and compete against each other to produce the best models. Work is shared publicly through Kaggle Kernels to achieve a better benchmark and to inspire new ideas. Submissions can be made through Kaggle Kernels, through manual upload or using the Kaggle API. For most competitions, submissions are scored immediately and summarized on a live leaderboard.
  3. After the deadline passes, the competition host pays the prize money in exchange for "a worldwide, perpetual, irrevocable and royalty-free license to use the winning Entry", i.e. the algorithm, software and related intellectual property developed, which is "non-exclusive unless otherwise specified".
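The scoring step in this workflow can be illustrated with a minimal sketch. It assumes a classification task scored by plain accuracy against labels that only the host holds, and a leaderboard that keeps each team's best score. The `Leaderboard` class, the `accuracy` metric, and the team names are hypothetical and for illustration only; they do not reflect Kaggle's actual implementation or API.

```python
from dataclasses import dataclass, field


def accuracy(predictions, truth):
    """Fraction of predictions that match the held-out labels."""
    if len(predictions) != len(truth):
        raise ValueError("submission must contain one prediction per test row")
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)


@dataclass
class Leaderboard:
    """Hypothetical leaderboard that keeps each team's best score so far."""
    truth: list                                  # hidden labels held by the host
    best: dict = field(default_factory=dict)     # team name -> best score

    def submit(self, team, predictions):
        # Score the submission immediately against the hidden labels.
        score = accuracy(predictions, self.truth)
        # A worse resubmission does not lower the team's standing.
        self.best[team] = max(score, self.best.get(team, 0.0))
        return score

    def standings(self):
        # Highest score first; ties are broken alphabetically by team name.
        return sorted(self.best.items(), key=lambda kv: (-kv[1], kv[0]))


board = Leaderboard(truth=[1, 0, 1, 1])
board.submit("team_a", [1, 0, 0, 1])   # 3 of 4 correct
board.submit("team_b", [1, 0, 1, 1])   # 4 of 4 correct
print(board.standings())
```

Real competitions use a metric chosen by the host, so accuracy here is only a stand-in.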
Alongside its public competitions, Kaggle also offers private competitions limited to its top participants. Kaggle offers a free tool for data science teachers to run academic machine learning competitions. It also hosts recruiting competitions in which data scientists compete for a chance to interview at leading data science companies such as Facebook, Winton Capital, and Walmart.

Impact of Kaggle competitions

Kaggle has run hundreds of machine learning competitions since the company was founded. Competitions have ranged from improving gesture recognition for Microsoft Kinect to improving the search for the Higgs boson at CERN.
Competitions have resulted in many successful projects, including furthering the state of the art in HIV research, chess ratings and traffic forecasting. Most famously, Geoffrey Hinton and George Dahl used deep neural networks to win a competition hosted by Merck, and Vlad Mnih used deep neural networks to win a competition hosted by Adzuna. These wins helped show the power of deep neural networks and resulted in the technique being taken up by others in the Kaggle community. Tianqi Chen from the University of Washington also used Kaggle to show the power of XGBoost, which has since taken over from Random Forest as one of the main methods used to win Kaggle competitions.
Several academic papers have been published on the basis of findings made in Kaggle competitions. A key to this is the effect of the live leaderboard, which encourages participants to continue innovating beyond existing best practice. The winning methods are frequently written up on the Kaggle blog.

Financials

In March 2017, Fei-Fei Li, Chief Scientist at Google, announced that Google was acquiring Kaggle during her keynote at Google Next.