On Temperature Scaling and Conformal Prediction of Deep Classifiers

Research output: Contribution to journal › Conference article › peer-review

Abstract

In many classification applications, the prediction of a deep neural network (DNN) based classifier needs to be accompanied by some confidence indication. Two popular approaches to that end are: 1) Calibration, which modifies the classifier’s softmax values such that the maximal value better estimates the correctness probability; and 2) Conformal Prediction (CP), which produces a prediction set of candidate labels that contains the true label with a user-specified probability, guaranteeing marginal coverage but not, e.g., per-class coverage. In practice, both types of indication are desirable, yet the interplay between them has not been investigated so far. Focusing on the ubiquitous Temperature Scaling (TS) calibration, we start this paper with an extensive empirical study of its effect on prominent CP methods. We show that while TS calibration improves the class-conditional coverage of adaptive CP methods, surprisingly, it negatively affects their prediction set sizes. Motivated by this behavior, we explore the effect of TS on CP beyond its calibration application and reveal an intriguing trend under which it allows one to trade off prediction set size against conditional coverage of adaptive CP methods. Then, we establish a mathematical theory that explains the entire non-monotonic trend. Finally, based on our experiments and theory, we offer guidelines for practitioners to effectively combine adaptive CP with calibration, aligned with user-defined goals.
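To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of temperature scaling followed by split conformal prediction with an adaptive, cumulative-probability (APS-style) score. It is an illustration under stated assumptions, not the paper's implementation: the function names, the non-randomized score, the choice to always keep the top-ranked class, and the synthetic logits are all hypothetical.

```python
import numpy as np

def softmax_t(logits, temperature=1.0):
    """Temperature scaling: softmax applied to logits / temperature."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def _sorted_cumulative(probs):
    """Classes ordered by decreasing probability, and the cumulative mass in that order."""
    order = np.argsort(-probs, axis=1)
    cum = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    return order, cum

def aps_scores(probs, labels):
    """Adaptive score: probability mass accumulated down to and including the true label."""
    order, cum = _sorted_cumulative(probs)
    rank = np.argmax(order == labels[:, None], axis=1)  # position of the true label
    return cum[np.arange(len(labels)), rank]

def conformal_threshold(scores, alpha):
    """The ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:  # too few calibration points: fall back to the full label set
        return 1.0
    return np.sort(scores)[k - 1]

def prediction_sets(probs, qhat):
    """Boolean mask (samples x classes) of labels whose adaptive score is <= qhat."""
    order, cum = _sorted_cumulative(probs)
    keep_sorted = cum <= qhat
    keep_sorted[:, 0] = True  # assumption: always keep the top class (no empty sets)
    mask = np.zeros_like(probs, dtype=bool)
    np.put_along_axis(mask, order, keep_sorted, axis=1)
    return mask

# Synthetic stand-in for a classifier's logits (hypothetical data).
rng = np.random.default_rng(0)
n, num_classes, alpha = 2000, 10, 0.1
labels = rng.integers(0, num_classes, size=n)
logits = rng.normal(size=(n, num_classes))
logits[np.arange(n), labels] += 2.0  # make the true class more likely
cal, test = np.arange(n // 2), np.arange(n // 2, n)

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_t(logits, temperature)
    qhat = conformal_threshold(aps_scores(probs[cal], labels[cal]), alpha)
    sets = prediction_sets(probs[test], qhat)
    coverage = sets[np.arange(len(test)), labels[test]].mean()
    print(f"T={temperature}: coverage={coverage:.3f}, mean set size={sets.sum(axis=1).mean():.2f}")
```

The loop applies the same temperature to the calibration and test splits and reports marginal coverage and mean set size for each value. The numbers come from synthetic logits and are not meant to reproduce the trends reported in the paper.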

Original language: English
Pages (from-to): 11813-11845
Number of pages: 33
Journal: Proceedings of Machine Learning Research
Volume: 267
State: Published - 2025
Event: 42nd International Conference on Machine Learning, ICML 2025 - Vancouver, Canada
Duration: 13 Jul 2025 - 19 Jul 2025

Bibliographical note

Publisher Copyright:
© 2025 by the author(s).
