title

section

abstract

layout

series

publisher

issn

id

month

tex_title

firstpage

lastpage

page

order

cycles

bibtex_author

author

date

address

container-title

volume

genre

issued

pdf

extras

On Computationally Efficient Multi-Class Calibration

Original Papers

Consider a multi-class labelling problem, where the labels can take values in $[k]$, and a predictor predicts a distribution over the labels. In this work, we study the following foundational question: \emph{Are there notions of multi-class calibration that give strong guarantees of meaningful predictions and can be achieved in time and sample complexities polynomial in $k$?} Prior notions of calibration exhibit a tradeoff between computational efficiency and expressivity: they either suffer from having sample complexity exponential in $k$, or needing to solve computationally intractable problems, or give rather weak guarantees. Our main contribution is a notion of calibration that achieves all these desiderata: we formulate a robust notion of \emph{projected smooth calibration} for multi-class predictions, and give new recalibration algorithms for efficiently calibrating predictors under this definition with complexity polynomial in $k$. Projected smooth calibration gives strong guarantees for all downstream decision makers who want to use the predictor for binary classification problems of the form: does the label belong to a subset $T \subseteq [k]$: \emph{e.g. is this an image of an animal?} It ensures that the probabilities predicted by summing the probabilities assigned to labels in $T$ are close to some perfectly calibrated binary predictor for that task. We also show that natural strengthenings of our definition are computationally hard to achieve: they run into information theoretic barriers or computational intractability. Underlying both our upper and lower bounds is a tight connection that we prove between multi-class calibration and the well-studied problem of agnostic learning in the (standard) binary prediction setting. This allows us to use kernel methods to design efficient algorithms, and also to use known hardness results for agnostic learning based on the hardness of refuting random CSPs to show lower bounds.

inproceedings

Proceedings of Machine Learning Research

PMLR

2640-3498

gopalan24a

0

On Computationally Efficient Multi-Class Calibration

1983

2026

1983-2026

1983

false

Gopalan, Parikshit and Hu, Lunjia and Rothblum, Guy N.

given	family
Parikshit	Gopalan

given	family
Lunjia	Hu

given	family
Guy N.	Rothblum

2024-06-30

Proceedings of Thirty Seventh Conference on Learning Theory

247

inproceedings

date-parts

2024

6

30

https://proceedings.mlr.press/v247/gopalan24a/gopalan24a.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2024-06-30-gopalan24a.md

2024-06-30-gopalan24a.md

Files

2024-06-30-gopalan24a.md

Latest commit

History

2024-06-30-gopalan24a.md

File metadata and controls