Reasoning

The Promise of RL for Autoregressive Image Editing

While image generation techniques are now capable of producing high-quality images that respect prompts which span multiple sentences, …

Saba Ahmadi, Rabiul Awal, Ankur Sikarwar, Amirhossein Kazemnejad, Ge Ya Luo, Juan A. Rodriguez, Sai Rajeswar Mudumba, Siva Reddy, Christopher Pal, Benno Krojer, Aishwarya Agrawal

Conference on Neural Information Processing Systems (NeurIPS), 2025.

WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation

Understanding diverse web data and automating web development presents an exciting challenge for agentic AI. While existing benchmarks …

Rabiul Awal, Mahsa Massoud, Zichao Li, Aarash Feizi, Suyuchen Wang, Christopher Pal, Aishwarya Agrawal, David Vazquez, Siva Reddy, Juan A. Rodriguez, Perouz Taslakian, Sai Rajeswar Mudumba

Workshop at the Computer Vision and Pattern Recognition Conference (CVPR), 2025.

WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation

Understanding diverse web data and automating web development presents an exciting challenge for agentic AI. While existing benchmarks …

Rabiul Awal, Mahsa Massoud, Zichao Li, Aarash Feizi, Suyuchen Wang, Christopher Pal, Aishwarya Agrawal, David Vazquez, Siva Reddy, Juan A. Rodriguez, Perouz Taslakian, Spandana Gella, Sai Rajeswar Mudumba

Workshop at the International Conference on Learning Representations (ICLR), 2025.

Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning

Decoder-only Transformers often struggle with complex reasoning tasks, particularly arithmetic reasoning requiring multiple sequential …

Md Rifat Arefin, Gopeshh Subbaraj, Nicolas Gontier, Yann LeCun, Irina Rish, Ravid Shwartz-Ziv, Christopher Pal

International Conference on Learning Representations (ICLR), 2025.

Evaluating Interventional Reasoning Capabilities of Large Language Models

Numerous decision-making tasks require estimating causal effects under interventions on different parts of a system. As practitioners …

Tejas Kasetty, Divyat Mahajan, Gintare Karolina Dziugaite, Alexandre Drouin, Dhanya Sridhar

Workshop at the Conference on Neural Information Processing Systems (NeurIPS), 2024.

Are Diffusion Models Vision-And-Language Reasoners?

Text-conditioned image generation models have recently shown immense qualitative success using denoising diffusion processes. However, …

Benno Krojer, Elinor Poole-Dayan, Vikram Voleti, Christopher Pal, Siva Reddy

Conference on Neural Information Processing Systems (NeurIPS), 2023.

Egocentric Planning for Scalable Embodied Task Achievement

Embodied agents face significant challenges when tasked with performing actions in diverse environments, particularly in generalizing …

Xiaotian Liu, Hector Palacios, Christian Muise

Conference on Neural Information Processing Systems (NeurIPS), 2023.

Explaining Graph Neural Networks Using Interpretable Local Surrogates

We propose an interpretable local surrogate (ILS) method for understanding the predictions of black-box graph models. Explainability …

Perouz Taslakian, Guillaume Rabusseau, Farzaneh Heidari

Workshop at the International Conference on Machine Learning (ICML), 2023.

OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning

Imagination is a crucial aspect of human intelligence that enables us to combine concepts in novel ways and make sense of new …

Rim Assouel, Pau Rodriguez, Perouz Taslakian, David Vazquez, Yoshua Bengio

Workshop at the International Conference on Machine Learning (ICML), 2023.

Knowledge Hypergraph Embedding Meets Relational Algebra

Embedding-based methods for reasoning in knowledge hypergraphs learn a representation for each entity and relation. Current methods do …

Bahare Fatemi, Perouz Taslakian, David Vazquez, David Poole

Journal of Machine Learning Research (JMLR), 2023.