ServiceNow Research

Computer Vision

MAPL: Parameter-Efficient Adaptation of Unimodal Pre-Trained Models for Vision-Language Few-Shot Prompting
Large pre-trained models have proved to be remarkable zero- and (prompt-based) few-shot learners in unimodal vision and language tasks. …
OCR-VQGAN: Taming Text-within-Image Generation
Synthetic image generation has recently experienced significant improvements in domains such as natural image or art generation. …
Haptics-based Curiosity for Sparse-reward Tasks
Robots in many real-world settings have access to force/torque sensors in their gripper, and tactile sensing is often necessary in tasks …
Consistency-CAM: Towards Improved Weakly Supervised Semantic Segmentation
Semantic segmentation is a popular task that has piqued the interest of many industries and research communities. However, acquiring …
A Planning based Neural-Symbolic Approach for Embodied Instruction Following
The ALFRED environment features an embodied agent following instructions and accomplishing tasks in simulated home environments. …
RaVAEn: Unsupervised Change Detection of Extreme Events Using ML On-Board Satellites
Applications such as disaster management enormously benefit from rapid availability of satellite observations. Traditionally, data …
A Planning based Neural-Symbolic Approach for Embodied Instruction Following
The ALFRED environment features embodied instruction following tasks in simulated home environments. However, end-to-end deep learning …
Multi-label Iterated Learning for Image Classification with Label Ambiguity
Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown …
Neural Point Light Fields
We introduce Neural Point Light Fields that represent scenes implicitly with a light field living on a sparse point cloud. Combining …