ServiceNow Research

Constraining Low-level Representations to Define Effective Confidence Scores

Abstract

Neural networks are known to fail with high confidence, especially on data that differs from the training distribution. Such unsafe behaviour limits their applicability. To counter this, we show that models offering accurate confidence levels can be defined by adding constraints to their internal representations. Specifically, we encode class labels as fixed, unique binary vectors, or class codes, and use them to enforce class-dependent activation patterns throughout the model’s depth. The resulting predictors are dubbed total activation classifiers (TAC), and a TAC is used as an additional component alongside a base classifier to indicate how reliable a prediction is. Empirically, we show that the resemblance between activation patterns and their corresponding codes yields an inexpensive unsupervised approach for inducing discriminative confidence scores. In particular, TAC is at least as good as state-of-the-art confidence scores extracted from existing models, while strictly improving the model’s value in the rejection setting.
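
The idea of scoring confidence by how closely an activation pattern resembles a class code can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`make_class_codes`, `activation_profile`, `confidence_score`), the random choice of codes, the chunk-and-sum reduction, the sigmoid squashing, and the mean-absolute-difference similarity are all assumptions made for illustration only.

```python
import numpy as np

def make_class_codes(num_classes, code_len, seed=0):
    """Draw fixed, unique binary codes, one row per class (hypothetical scheme)."""
    if num_classes > 2 ** code_len:
        raise ValueError("code_len is too small for unique codes")
    rng = np.random.default_rng(seed)
    while True:
        codes = rng.integers(0, 2, size=(num_classes, code_len))
        # Resample until every class has a distinct code.
        if len(np.unique(codes, axis=0)) == num_classes:
            return codes

def activation_profile(features, code_len):
    """Reduce a 1-D activation vector to code_len values in (0, 1).

    Assumption: activations are split into contiguous chunks, each chunk is
    summed, and the totals are squashed with a sigmoid.
    """
    chunks = np.array_split(np.asarray(features, dtype=float), code_len)
    totals = np.array([chunk.sum() for chunk in chunks])
    return 1.0 / (1.0 + np.exp(-totals))

def confidence_score(profile, codes, predicted_class):
    """Score in [0, 1]: higher means the profile is closer to the predicted class's code."""
    return 1.0 - np.abs(profile - codes[predicted_class]).mean()
```

Under these assumptions, a prediction whose activation profile sits far from the code of the predicted class receives a low score and could be rejected by thresholding, which is the role the abstract assigns to TAC alongside a base classifier.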

Publication
Workshop at the Conference on Neural Information Processing Systems (NeurIPS)
Pierre-André Noël
Research Scientist

Research Scientist at AI Frontier Research, located in Montreal, QC, Canada.

Issam H. Laradji
Research Manager

Research Manager at AI Frontier Research, located in Vancouver, BC, Canada.

David Vazquez
Director of AI Research

Director of AI Research at AI Research Management, located in Montreal, QC, Canada.