Chair of Artificial Intelligence and Machine Learning



Calibration of scoring classifiers: Survey and empirical comparison (Ba/Ma)

Topic for a bachelor's or master's thesis

Short Description:

Many classification methods in machine learning, binary as well as multi-class, produce predictions in the form of scores that indicate the classifier's confidence in a class assignment. Such scores usually cannot be interpreted as probabilities, however, even though probabilities would be desirable in many applications. So-called calibration methods have been developed to address this [1, 2, 3, 4]. Roughly speaking, a calibration method learns a mapping from scores to probabilities. Calibration can be applied as a post-processing step, and most calibration methods can be combined with any scoring classifier.
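To make the idea of a learned mapping from scores to probabilities concrete, the following is a minimal sketch in the spirit of the sigmoid calibration of [1]. It is an illustration under stated assumptions, not the procedure from the paper: the parameterization p = sigmoid(a * s + b), the plain gradient descent fit, the function names, and the toy scores and labels are all chosen here for illustration, and the refinements described in [1] (such as its treatment of target values and its optimizer) are omitted.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_sigmoid_calibrator(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * s + b) by gradient descent on the mean log loss.

    scores: raw classifier scores on a held-out calibration set
    labels: corresponding binary labels (0 or 1)
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    # Map a raw score to a calibrated probability estimate.
    return sigmoid(a * score + b)


# Made-up calibration data: scores of some classifier and the true labels.
scores = [-2.1, -1.3, -0.8, -0.2, 0.1, 0.4, 0.9, 1.5, 2.2, 3.0]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_sigmoid_calibrator(scores, labels)
probs = [calibrate(s, a, b) for s in scores]
```

The fitted pair (a, b) is independent of the classifier that produced the scores, which is what allows calibration to be used as a post-processing step.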

The goal of this thesis is a comparative study of existing calibration methods. This includes a survey, in which each of these methods is described, a systematic comparison according to suitable criteria, an implementation of the methods, and an empirical study, in which their performance on real data sets is evaluated.
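The description leaves the evaluation criteria open. As one possible starting point, the sketch below computes two measures that are commonly used to assess probability estimates for binary problems: the Brier score and a binned calibration error. Both the choice of these measures and the details (equal-width bins, ten bins by default, the function names) are assumptions made here for illustration.

```python
def brier_score(probs, labels):
    # Mean squared difference between predicted probability and outcome.
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned calibration error for the positive class.

    Predictions are grouped into equal-width probability bins; for each bin,
    the mean predicted probability is compared with the observed frequency
    of positives, and the gaps are averaged with weights given by bin size.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        pos_freq = sum(y for _, y in members) / len(members)
        ece += len(members) / n * abs(mean_prob - pos_freq)
    return ece


# Made-up example: an overconfident predictor that always outputs 0.9
# although only half of the instances are positive.
overconfident_probs = [0.9, 0.9, 0.9, 0.9]
overconfident_labels = [1, 1, 0, 0]

ece = expected_calibration_error(overconfident_probs, overconfident_labels)
brier = brier_score(overconfident_probs, overconfident_labels)
```

In an empirical comparison, such measures would be computed on test data that was used neither for training the classifier nor for fitting the calibration mapping.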


Tasks:

Survey and systematic comparison of calibration methods; implementation; empirical evaluation of such methods.


Prerequisites:

Basic knowledge of machine learning; programming skills.


Supervisor:

Prof. Eyke Hüllermeier


References:

  • [1] J. Platt. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In A.J. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 61–74, Cambridge, MA, 1999. MIT Press.
  • [2] B. Zadrozny and C. Elkan. Transforming classifier scores into accurate multiclass probability estimates. In Proc. KDD-02, 8th International Conference on Knowledge Discovery and Data Mining, pages 694–699, Edmonton, Alberta, Canada, 2002.
  • [3] M.P. Naeini, G.F. Cooper, and M. Hauskrecht. Obtaining well calibrated probabilities using Bayesian binning. In Proc. AAAI, National Conference on Artificial Intelligence, 2015.
  • [4] M. Kull, T.M. Silva Filho, and P. Flach. Beyond sigmoids: How to obtain well-calibrated probabilities from binary classifiers with beta calibration. Electronic Journal of Statistics, 11(2):5021–5080, 2017.