The authors of a machine learning research paper have been awarded a prize for their work, ten years after an earlier version of the paper was rejected.
The 2013 Classic Paper Prize at the International Conference on Machine Learning (ICML) was won by Zoubin Ghahramani and co-authors Xiaojin Zhu and John Lafferty for their 2003 paper "Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions". The Classic Paper Prize is given to the paper published at ICML 10 years previously which has had the most impact on the field.
Zoubin Ghahramani, Professor of Information Engineering in the Department of Engineering, said: "I'm delighted that our paper won this award. Interestingly, this paper was a revised version of a paper that had been rejected from another major conference, the Neural Information Processing Systems Conference (NIPS).
"Perhaps a moral to this story for young researchers is not to take rejection to heart. I am chairing NIPS this year, so when I send out about 1000 rejections in a couple of months I will be wondering how many of those rejected papers contain ideas which in 10 years will have, or could have had, a major impact on the field!"
This 2003 paper, which has now been cited over 1400 times, developed a simple and highly scalable graph-based method for semi-supervised classification, and related it to harmonic functions, random walks, electric networks, and spectral graph theory. Graph-based semi-supervised learning has now become a standard approach for combining labelled and unlabelled data in many application domains. Semi-supervised learning refers to the problem of combining small amounts of labelled data (the setting of supervised learning) with large amounts of unlabelled data (the setting of unsupervised learning).
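For readers who want a feel for the idea, here is a minimal sketch of a harmonic function on a graph. It is an illustration only, not the authors' implementation. Each unlabelled node repeatedly takes the weighted average of its neighbours' scores while the labelled nodes stay fixed. The function name `harmonic_labels`, the five-node chain graph and the simple iterative averaging are all assumptions made for this example; the same solution can also be obtained by solving a linear system directly.

```python
def harmonic_labels(weights, labels, iterations=1000, tol=1e-10):
    """Illustrative sketch: spread labels over a graph by neighbour averaging.

    weights: symmetric n x n list of lists of non-negative edge weights
             (assumed to have a zero diagonal, i.e. no self-loops).
    labels:  dict mapping a labelled node's index to its score (0.0 or 1.0).
    Returns a list of n scores, one per node.
    """
    n = len(weights)
    # Labelled nodes keep their score; unlabelled nodes start at 0.5.
    f = [labels.get(i, 0.5) for i in range(n)]
    unlabelled = [i for i in range(n) if i not in labels]

    for _ in range(iterations):
        max_change = 0.0
        for i in unlabelled:
            degree = sum(weights[i])
            if degree == 0:
                continue  # isolated node: nothing to average over
            # Harmonic property: a node's score is the weighted
            # average of its neighbours' scores.
            new_value = sum(weights[i][j] * f[j] for j in range(n)) / degree
            max_change = max(max_change, abs(new_value - f[i]))
            f[i] = new_value
        if max_change < tol:
            break  # scores have stopped changing
    return f


# Toy example (hypothetical data): a chain 0 - 1 - 2 - 3 - 4 with unit weights.
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
# Only the two end nodes are labelled: node 0 is class 1, node 4 is class 0.
scores = harmonic_labels(chain, {0: 1.0, 4: 0.0})
print([round(s, 3) for s in scores])  # approximately [1.0, 0.75, 0.5, 0.25, 0.0]
```

On this chain the scores fall off evenly from the node labelled 1 to the node labelled 0. This matches the random-walk reading mentioned above: each score is the probability that a walk starting at that node reaches the node labelled 1 before the node labelled 0.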
Zoubin added: "The web has made available vast amounts of unlabelled text, images, videos, music and other kinds of data, and many fields of science now collect and share vast amounts of scientific data, but obtaining high-quality labels or annotations is still difficult. This is exactly the scenario where semi-supervised learning is particularly valuable: when obtaining labelled data is expensive or time-consuming, but unlabelled data is cheap and plentiful."