

$ECUAS_n$: A family of metrics for principled evaluation of uncertainty-augmented systems

DzCademia Summary

This page structures AI and research content so that researchers, students, and AI engines can read, cite, and verify it more easily.

arXiv:2605.20490v2

Abstract: In high-stakes automated decision-making, access to predictive uncertainty is essential for enabling users -- human or downstream systems -- to accept or reject predictions based on application-specific cost trade-offs. Such uncertainty-augmented (UA) systems -- i.e., systems that output both predictions and uncertainty scores -- are currently being assessed in the literature in a variety of ways, using separate metrics to evaluate the predictions and the uncertainty scores, setting a cost function with a fixed rejection cost or integrating over a coverage-risk curve. We argue that these evaluation approaches are inadequate for assessing overall performance of the UA system for decision making under uncertainty and propose a novel family of metrics, $ECUAS_n$, formulated as proper scoring rules for the task of interest. The parameter $n$ controls the trade-off between the cost of incorrect predictions and imperfect uncertainties depending on the needs of the use-case. We demonstrate the advantages of the $ECUAS_n$ metrics both theoretically and empirically, through experiments on diverse classification and generation datasets, including a manually annotated subset of TriviaQA.
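For orientation, the two existing evaluation approaches named in the abstract (a cost function with a fixed rejection cost, and integration over a coverage-risk curve) can be sketched as follows. This is a minimal illustration and not the paper's $ECUAS_n$ metric, whose definition is not given on this page. The function names, the 0/1 error cost, and the use of a plain average of risk over coverage levels as the "area" are all assumptions made for this example.

```python
def cost_with_rejection(correct, uncertainty, threshold, rejection_cost):
    """Average cost when predictions with uncertainty above `threshold` are rejected.

    Assumed cost model (hypothetical): an accepted error costs 1, an accepted
    correct prediction costs 0, and a rejection costs `rejection_cost`.
    """
    costs = []
    for is_correct, u in zip(correct, uncertainty):
        if u > threshold:
            costs.append(rejection_cost)
        else:
            costs.append(0.0 if is_correct else 1.0)
    return sum(costs) / len(costs)


def risk_coverage_curve(correct, uncertainty):
    """Return (coverage, risk) lists, accepting samples from least to most uncertain."""
    order = sorted(range(len(correct)), key=lambda i: uncertainty[i])
    coverage, risk, errors = [], [], 0
    for k, i in enumerate(order, start=1):
        errors += 0 if correct[i] else 1
        coverage.append(k / len(order))  # fraction of samples accepted so far
        risk.append(errors / k)          # error rate among accepted samples
    return coverage, risk


def area_under_risk_coverage(correct, uncertainty):
    """Approximate the area under the curve as the mean risk over coverage levels."""
    _, risk = risk_coverage_curve(correct, uncertainty)
    return sum(risk) / len(risk)


# Toy data (made up for illustration): four predictions with uncertainty scores.
correct = [True, True, False, True]
uncertainty = [0.1, 0.2, 0.9, 0.4]

fixed_cost = cost_with_rejection(correct, uncertainty, threshold=0.5, rejection_cost=0.3)
aurc = area_under_risk_coverage(correct, uncertainty)
```

In this toy case the only error also has the highest uncertainty, so both quantities are small. The first depends on a single chosen threshold and rejection cost, while the second averages over all coverage levels.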
artificial intelligence

Official or original source: arXiv AI. Always verify details against the primary source.
