Theory of Machine Learning

Both PAC-Bayesian and Sample Compress learning frameworks have been shown instrumental for deriving tight (non-vacuous) generalization …

International Conference on Machine Learning (ICML), 2025.

We analyze the convergence of a novel policy gradient algorithm (referred to as SPMA) for multi-armed bandits and tabular Markov …

International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.

The sample compression theory provides generalization guarantees for predictors that can be fully defined using a subset of the …

International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.

Training large language models (LLMs) for pretraining or adapting to new tasks and domains has become increasingly critical as their …

Issam H. Laradji, Amrutha Ramesh, Mark Schmidt

Workshop at the Neural Information Processing Systems (NeurIPS), 2024.

We analyze the convergence of a novel policy gradient algorithm (referred to as SPMA) for multi-armed bandits and tabular Markov …

Issam H. Laradji, Reza Asad, Sharan Vaswani

Workshop at the Neural Information Processing Systems (NeurIPS), 2024.

Reconstruction functions are pivotal in sample compression theory, a framework for deriving tight generalization bounds. From a small …

Workshop at the Neural Information Processing Systems (NeurIPS), 2024.

The sample compression theory provides generalization guarantees for predictors that can be fully defined using a subset of the …

Workshop at the Neural Information Processing Systems (NeurIPS), 2024.

Early Exiting (EE) is a promising technique for speeding up inference at the cost of limited performance loss. It adaptively allocates …

Workshop at the International Conference of Machine Learning (ICML), 2024.

In statistical learning theory, a generalization bound usually involves a complexity measure imposed by the considered theoretical …

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024.

Optimizing large memory-intensive neural networks requires distributing its layers across multiple GPUs (referred to as model …

Workshop at the Neural Information Processing Systems (NeurIPS), 2023.