ServiceNow Research

BigDocs: A Permissively-Licensed Dataset for Training Vision-Language Models on Document and Code Tasks

Abstract

Vision and language models that accurately understand both images and text are crucial for deeper document understanding. Such models can efficiently perform enterprise-level tasks such as processing receipts from screenshots, generating websites and business workflows from sketches, and extracting information from structured documents. These tasks often require generating long, structured outputs, an area where models trained on current datasets struggle. Additionally, many existing datasets are not permissively licensed, which limits their use to non-commercial applications. To address these limitations, we present BigDocs, a high-quality, purpose-curated dataset for training permissively licensed Vision and Language Models (VLMs) capable of performing a wide variety of tasks. The dataset focuses on acquiring accurate image-text pairs across diverse tasks while adhering to accountability, responsibility, and transparency (ART) standards. Our preliminary experiments show that pre-training with BigDocs improves performance on document reasoning and on tasks requiring long structured outputs, such as screenshot-to-HTML, table-to-LaTeX, and image-to-SVG. We believe that VLMs trained on BigDocs have the potential to significantly enhance multimodal capabilities, benefiting broader research in multimodal document understanding.

Publication
Workshop at the Conference on Neural Information Processing Systems (NeurIPS)
Juan A. Rodriguez
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Montreal, QC, Canada.

Tianyu Zhang
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Montreal, QC, Canada.

Aarash Feizi
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Montreal, QC, Canada.

Abhay Puri
Applied Research Scientist

Applied Research Scientist at AI Research Deployment located at Montreal, QC, Canada.

Amirhossein Abaskohi
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Vancouver, BC, Canada.

Ahmed Masry
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Toronto, ON, Canada.

Shravan Nayak
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Montreal, QC, Canada.

Rabiul Awal
Visiting Researcher

Visiting Researcher at AI Frontier Research located at Montreal, QC, Canada.

Pierre-André Noël
Research Scientist

Research Scientist at AI Frontier Research located at Montreal, QC, Canada.

Torsten Scholak
Research Lead

Research Lead at AI Research Deployment located at Montreal, QC, Canada.

Nicolas Chapados
VP of Research

VP of Research at AI Research Management located at Montreal, QC, Canada.

Sean Hughes
AI Ecosystem Director

AI Ecosystem Director at AI Research Partnerships & Ecosystem located at San Diego, CA, USA.

Christopher Pal
Distinguished Scientist

Distinguished Scientist at AI Research Partnerships & Ecosystem located at Montreal, QC, Canada.

Perouz Taslakian
Research Lead

Research Lead at AI Frontier Research located at Montreal, QC, Canada.

David Vazquez
Director of AI Research

Director of AI Research at AI Research Management located at Montreal, QC, Canada.

Issam H. Laradji
Research Manager

Research Manager at AI Frontier Research located at Vancouver, BC, Canada.

Spandana Gella
Research Manager

Research Manager at AI Frontier Research located at Montreal, QC, Canada.