ServiceNow Research

Constraining Low-level Representations to Define Effective Confidence Scores

Abstract

Neural networks are known to fail with high confidence, especially on data that differs from the training distribution. Such unsafe behaviour limits their applicability. To counter this, we show that models offering accurate confidence levels can be defined by adding constraints to their internal representations. Specifically, we encode class labels as fixed, unique binary vectors, or class codes, and use them to enforce class-dependent activation patterns throughout the model’s depth. The resulting predictors are dubbed total activation classifiers (TAC), and a TAC is used as an additional component alongside a base classifier to indicate how reliable a prediction is. Empirically, we show that the resemblance between activation patterns and their corresponding codes yields an inexpensive unsupervised approach for inducing discriminative confidence scores. In particular, TAC is at least as good as state-of-the-art confidence scores extracted from existing models, while strictly improving the model’s value in the rejection setting.
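The idea of scoring a prediction by how closely an activation pattern resembles its class code can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`make_class_codes`, `activation_profile`, `confidence`), the slice-and-average pooling, the sigmoid squashing, and the L1 distance are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import numpy as np

def make_class_codes(num_classes, code_len, seed=0):
    """Draw fixed, unique binary codes, one row per class (hypothetical helper)."""
    if num_classes > 2 ** code_len:
        raise ValueError("code_len is too short to give every class a unique code")
    rng = np.random.default_rng(seed)
    codes = set()
    while len(codes) < num_classes:
        codes.add(tuple(int(b) for b in rng.integers(0, 2, size=code_len)))
    return np.array(sorted(codes), dtype=float)

def activation_profile(features, code_len):
    """Reduce a feature vector to code_len values in (0, 1).

    Assumption: split the features into code_len slices, average each slice,
    and squash with a sigmoid. The paper may pool activations differently.
    """
    chunks = np.array_split(np.asarray(features, dtype=float), code_len)
    pooled = np.array([chunk.mean() for chunk in chunks])
    return 1.0 / (1.0 + np.exp(-pooled))

def confidence(profile, codes, predicted_class):
    """Negative L1 distance to the predicted class code (higher = more confident)."""
    return -np.abs(np.asarray(profile) - codes[predicted_class]).sum()

codes = make_class_codes(num_classes=4, code_len=8)
# Toy features whose slice means strongly match the code of class 0.
features = np.repeat((codes[0] * 2 - 1) * 10.0, 3)
profile = activation_profile(features, code_len=8)
scores = [confidence(profile, codes, k) for k in range(4)]
```

In a rejection setting, one would abstain whenever the score for the base classifier's predicted class falls below a threshold; how that threshold is chosen (for example, on held-out data) is likewise an assumption here.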

Publication
Workshop at the Conference on Neural Information Processing Systems (NeurIPS)
João Monteiro
Research Scientist

Research Scientist at Low Data Learning, located in London, UK.

Pierre-André Noël
Applied Research Scientist

Applied Research Scientist at Low Data Learning, located in Montreal, QC, Canada.

Issam H. Laradji
Research Scientist

Research Scientist at Low Data Learning, located in Vancouver, BC, Canada.

David Vazquez
Manager of Research Programs

Manager of Research Programs at Research Management, located in Montreal, QC, Canada.