This paper was accepted at the Workshop on Distribution-Free Uncertainty Quantification at ICML 2022.
Calibration is a fundamental property of a good predictive model: it requires that the model predicts correctly in proportion to its confidence. Modern neural networks, however, provide no strong guarantees on their calibration: they can be either poorly calibrated or well-calibrated depending on the setting. It is currently unclear which factors contribute to good calibration (architecture, data augmentation, overparameterization, etc.), though various claims exist in the literature. We propose a systematic way to study the calibration error: by decomposing it into (1) calibration error on the train set, and (2) the calibration generalization gap. This mirrors the fundamental decomposition of generalization. We then investigate each of these terms, and give empirical evidence that (1) DNNs are typically always calibrated on their train set, and (2) the calibration generalization gap is upper-bounded by the standard generalization gap. Taken together, this implies that models with a small generalization gap (|Test Error – Train Error|) are well-calibrated. This perspective unifies many results in the literature, and suggests that interventions which reduce the generalization gap (such as adding data, using heavy augmentation, or using a smaller model size) also improve calibration. We thus hope our initial study lays the groundwork for a more systematic and comprehensive understanding of the relation between calibration, generalization, and optimization.
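As a minimal sketch of the decomposition described above (the notation $\mathrm{CalErr}$, $\mathrm{TrainError}$, and $\mathrm{TestError}$ is assumed here for illustration, not taken from the paper):
\[
\underbrace{\mathrm{CalErr}_{\mathrm{test}}(f)}_{\text{test calibration error}}
\;=\;
\underbrace{\mathrm{CalErr}_{\mathrm{train}}(f)}_{\text{(1) train calibration error}}
\;+\;
\underbrace{\big(\mathrm{CalErr}_{\mathrm{test}}(f) - \mathrm{CalErr}_{\mathrm{train}}(f)\big)}_{\text{(2) calibration generalization gap}}.
\]
Under this reading, the paper's empirical claims correspond to the first term being close to zero and the second term being controlled by the standard generalization gap, roughly
\[
\big|\mathrm{CalErr}_{\mathrm{test}}(f) - \mathrm{CalErr}_{\mathrm{train}}(f)\big|
\;\le\;
\big|\mathrm{TestError}(f) - \mathrm{TrainError}(f)\big|,
\]
so that a small generalization gap implies a small test calibration error.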