Computer Vision

GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks

While numerous recent benchmarks focus on evaluating generic Vision-Language Models (VLMs), they fall short in addressing the unique …

Muhammad Sohail Danish, Muhammad Akhtar Munir, Syed Roshaan Ali Shah, Kartik Kuckreja, Fahad Shahbaz Khan, Paolo Fraccaro, Alexandre Lacoste, Salman Khan

International Conference on Computer Vision (ICCV), 2025.

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Multimodal AI has the potential to significantly enhance document-understanding tasks, such as processing receipts, understanding …

Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi, Tianyu Zhang, Aarash Feizi, Abhay Puri, Akshay Kalkunte, Francois Savard, Ahmed Masry, Shravan Nayak, Rabiul Awal, Mahsa Massoud, Amirhossein Abaskohi, Zichao Li, Suyuchen Wang, Pierre-André Noël, Mats L. Richter, Saverio Vadacchino, Shubham Agarwal, Sanket Biswas, Sara Shanian, Ying Zhang, Sathwik Tejaswi Madhusudhan, João Monteiro, Krishnamurthy (Dj) Dvijotham, Torsten Scholak, Nicolas Chapados, Sepideh Kharaghani, Sean Hughes, Tamer Özsu, Siva Reddy, Marco Pedersoli, Yoshua Bengio, Christopher Pal, Issam H. Laradji, Spandana Gella, Perouz Taslakian, David Vazquez, Sai Rajeswar Mudumba

International Conference on Learning Representations (ICLR), 2025.

Deep Learning in Ultrasound Localization Microscopy: Applications and Perspectives

Ultrasound Localization Microscopy (ULM) is a novel super-resolution imaging technique that can image the vasculature in vivo at depth …

Brice Rauby, Paul Xing, Maxime Gasse, Jean Provost

IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control (IEEE TUFFC), 2025.

Pruning Sparse Tensor Neural Networks Enables Deep Learning for 3D Ultrasound Localization Microscopy

Ultrasound Localization Microscopy (ULM) is a non-invasive technique that allows for the imaging of micro-vessels in vivo, at depth and …

Brice Rauby, Paul Xing, Jonathan Porée, Maxime Gasse, Jean Provost

IEEE Transactions on Image Processing (IEEE TIP), 2025.

Few-shot Learning for Sign Language Recognition with Embedding Propagation

https://nafath.mada.org.qa/nafath-article/mcn2704/

Amjad Alsulami, Khawlah Bajbaa, Hamzah Luqman, Issam H. Laradji

Nafath, 2024.

VCR: Visual Caption Restoration

We introduce Visual Caption Restoration (VCR), a novel vision-language task that challenges models to accurately restore partially …

Tianyu Zhang, Suyuchen Wang, Lu Li, Ge Zhang, Perouz Taslakian, Sai Rajeswar Mudumba, Jie Fu, Bang Liu, Yoshua Bengio

Workshop at the Conference on Neural Information Processing Systems (NeurIPS), 2024.

FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering

Multimodal multihop question answering is a complex task that requires reasoning over multiple sources of information, such as images …

Issam H. Laradji, Amirhossein Abaskohi, Giuseppe Carenini, Spandana Gella

ArXiv, 2024.

Pruning Sparse Tensor Neural Networks Enables Deep Learning for 3D Ultrasound Localization Microscopy

Ultrasound Localization Microscopy (ULM) is a non-invasive technique that allows for the imaging of micro-vessels in vivo, at depth and …

Brice Rauby, Paul Xing, Jonathan Porée, Maxime Gasse, Jean Provost

ArXiv, 2024.

StarVector: Generating Scalable Vector Graphics Code from Images and Text

Scalable Vector Graphics (SVGs) have become integral in modern image rendering and graphic design applications due to their infinite …

Juan A. Rodriguez, Shubham Agarwal, Abhay Puri, Issam H. Laradji, Sai Rajeswar Mudumba, Pau Rodriguez, David Vazquez, Christopher Pal, Marco Pedersoli

ArXiv, 2024.

GEO-Bench: Toward Foundation Models for Earth Monitoring

Recent progress in self-supervision shows that pre-training large neural networks on vast amounts of unsupervised data can lead to …

Alexandre Lacoste, Nils Lehmann, Hannah Kerner, Hamed Alemohammad, Björn Lütjens, Jeremy Irvin, David Dao, Pau Rodriguez, Alexandre Drouin, David Vazquez, Evan D. Sherwin

NeurIPS Datasets and Benchmarks Track (NeurIPS Datasets), 2023.