ServiceNow Research

Guarantees

On the role of data in PAC-Bayes bounds
The dominant term in PAC-Bayes bounds is often the Kullback–Leibler divergence between the posterior and prior. For so-called …
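For context, a standard McAllester/Maurer-style PAC-Bayes bound of the kind the abstract refers to (a sketch of the classical statement, not this paper's result): for a loss in [0, 1], with probability at least 1 − δ over an i.i.d. sample of size n, simultaneously for all posteriors Q,

```latex
\mathbb{E}_{h \sim Q}\big[L(h)\big]
  \;\le\;
\mathbb{E}_{h \sim Q}\big[\hat{L}_n(h)\big]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\tfrac{2\sqrt{n}}{\delta}}{2n}}
```

where P is a data-independent prior and L, L̂_n are the population and empirical risks; the KL(Q‖P) term is the dominant quantity in question.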
Sharpened Generalization Bounds based on Conditional Mutual Information and an Application to Noisy-Gradient Iterative Algorithms
The information-theoretic framework of Russo and Zou (2016) and Xu and Raginsky (2017) provides bounds on the generalization error …
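As a reference point, the basic bound in that framework (Xu and Raginsky, 2017), for a loss that is σ-sub-Gaussian under the data distribution, reads (sketching the standard statement, not this paper's sharpened version):

```latex
\Big|\,\mathbb{E}\big[L(W) - \hat{L}_n(W)\big]\Big|
  \;\le\;
\sqrt{\frac{2\sigma^{2}}{n}\; I(W; S)}
```

where S is the training sample of size n, W the output of the learning algorithm, and I(W; S) their mutual information.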
On the Information Complexity of Proper Learners for VC Classes in the Realizable Case
We provide a negative resolution to a conjecture of Steinke and Zakynthinou (2020a), by showing that their bound on the conditional …
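The conjecture concerns the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020), whose basic guarantee for a loss bounded in [0, 1] is (standard form, sketched here for context):

```latex
\Big|\,\mathbb{E}\big[L(W) - \hat{L}_n(W)\big]\Big|
  \;\le\;
\sqrt{\frac{2}{n}\; I\big(W;\, U \,\big|\, \tilde{Z}\big)}
```

where Z̃ is a supersample of 2n points, U ∈ {0, 1}^n selects which half forms the training set, and W is the learner's output.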
In Defense of Uniform Convergence: Generalization via derandomization with an application to interpolating predictors
We propose to study the generalization error of a learned predictor ĥ in terms of that of a surrogate (potentially randomized) …
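One way to read the surrogate idea (an illustrative decomposition, not the paper's theorem): for any surrogate h̃, the generalization gap of ĥ splits exactly as

```latex
L(\hat{h}) - \hat{L}_n(\hat{h})
  \;=\;
\big(L(\hat{h}) - L(\tilde{h})\big)
 + \big(L(\tilde{h}) - \hat{L}_n(\tilde{h})\big)
 + \big(\hat{L}_n(\tilde{h}) - \hat{L}_n(\hat{h})\big)
```

so bounding the surrogate's generalization (e.g., via uniform convergence over a simpler class) together with the two proximity terms bounds the gap for ĥ itself.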
Linear Mode Connectivity and the Lottery Ticket Hypothesis
We study whether a neural network optimizes to the same, linearly connected minimum under different samples of SGD noise (e.g., random …
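A minimal sketch of the linear-interpolation check described here, in PyTorch; model, the two checkpoints sd_a and sd_b, and eval_fn stand in for whatever architecture, training runs, and test-error routine are actually used:

```python
import copy
import torch

def interpolate_state_dicts(sd_a, sd_b, alpha):
    """Elementwise (1 - alpha) * w_a + alpha * w_b on floating-point tensors."""
    return {
        k: (1 - alpha) * sd_a[k] + alpha * sd_b[k] if sd_a[k].is_floating_point() else sd_a[k]
        for k in sd_a
    }

@torch.no_grad()
def linear_path_errors(model, sd_a, sd_b, eval_fn, num_points=11):
    """Evaluate error along the straight line between two trained solutions.

    The two solutions are linearly mode connected when the error along the
    path stays close to the error at the endpoints (near-zero barrier).
    """
    errors = []
    for i in range(num_points):
        alpha = i / (num_points - 1)
        probe = copy.deepcopy(model)
        probe.load_state_dict(interpolate_state_dicts(sd_a, sd_b, alpha))
        errors.append(eval_fn(probe))  # e.g. error on a held-out test set
    barrier = max(errors) - max(errors[0], errors[-1])  # rise above the endpoints
    return errors, barrier
```

Running this on two copies of a network trained from the same starting point but with different SGD noise gives the instability measure the abstract alludes to.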
Stabilizing the Lottery Ticket Hypothesis
Pruning is a well-established technique for removing unnecessary structure from neural networks after training to improve the …
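A rough sketch of iterative magnitude pruning with weight rewinding, the procedure this line of work stabilizes; train_fn is a placeholder, and the model is assumed to have been trained for a few warm-up iterations before the call so that the snapshot taken at entry serves as the rewind point:

```python
import torch
import torch.nn.utils.prune as prune

def imp_with_rewinding(model, train_fn, prune_fraction=0.2, rounds=5):
    """Iterative magnitude pruning (IMP) with weight rewinding.

    Each round: train to completion, globally prune the smallest-magnitude
    remaining weights, then rewind the surviving weights to the early-training
    snapshot (rather than to initialization) before retraining.
    """
    params_to_prune = [
        (m, "weight")
        for m in model.modules()
        if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d))
    ]
    rewind = {id(m): m.weight.detach().clone() for m, _ in params_to_prune}

    for _ in range(rounds):
        train_fn(model)  # train the (possibly already masked) network
        prune.global_unstructured(
            params_to_prune, pruning_method=prune.L1Unstructured, amount=prune_fraction
        )
        with torch.no_grad():
            for m, _ in params_to_prune:
                m.weight_orig.copy_(rewind[id(m)])  # rewind survivors; masks persist
    return model
```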
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
In this work, we improve upon the stepwise analysis of noisy iterative learning algorithms initiated by Pensia, Jog, and Loh (2018) and …
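For reference, the noisy iterative algorithm in question is stochastic gradient Langevin dynamics (SGLD), whose update in a standard parameterization (step sizes and noise scale shown generically) is

```latex
W_{t+1} \;=\; W_t \;-\; \eta_t \,\nabla \hat{L}_{B_t}(W_t)
  \;+\; \sqrt{\tfrac{2\eta_t}{\beta}}\;\xi_t,
\qquad \xi_t \sim \mathcal{N}(0, I_d)
```

where B_t is the minibatch at step t and β an inverse temperature; stepwise analyses bound the generalization error by accumulating per-iteration information terms along this trajectory.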
Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Recent works have shown that stochastic gradient descent (SGD) achieves the fast convergence rates of full-batch gradient descent for …
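A minimal sketch of SGD with a stochastic Armijo backtracking line search of the kind studied in this setting; loss_fn re-evaluates the current minibatch loss, and the constants are illustrative:

```python
import torch

def sgd_armijo_step(params, loss_fn, eta_max=1.0, c=0.1, beta=0.9, max_backtracks=50):
    """One SGD step with a backtracking (Armijo) line search on the minibatch loss.

    Shrinks the step size eta until
        f_B(w - eta * g) <= f_B(w) - c * eta * ||g||^2
    holds on the current minibatch B; if no eta succeeds, no step is taken.
    """
    loss = loss_fn()
    grads = torch.autograd.grad(loss, params)
    grad_norm_sq = sum((g ** 2).sum() for g in grads)

    eta = eta_max
    with torch.no_grad():
        for _ in range(max_backtracks):
            for p, g in zip(params, grads):
                p.sub_(eta * g)          # tentative step
            if loss_fn() <= loss - c * eta * grad_norm_sq:
                break                    # Armijo condition satisfied
            for p, g in zip(params, grads):
                p.add_(eta * g)          # undo the step
            eta *= beta                  # backtrack
    return eta
```

Under interpolation (every minibatch loss can be driven to its minimum at the same point), this kind of adaptive step size is what allows SGD to match full-batch rates without tuning a step-size schedule.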