TextVQA

by Amanpreet Singh, Vivek Natarajan, Meet Shah, Yu Jiang, Xinlei Chen, Dhruv Batra, Devi Parikh, Marcus Rohrbach

License: CC-BY

TextVQA requires models to read and reason about text in images in order to answer questions about them. Specifically, models must treat the text present in an image as an additional modality and reason over it to answer TextVQA questions.

Statistics

28,408 images from OpenImages
45,336 questions
453,360 ground truth answers
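
The statistics above imply a fixed number of ground truth answers per question. The following is a minimal sketch that checks this arithmetic; the constant names and the `per_unit` helper are illustrative assumptions and are not part of any TextVQA tooling.

```python
# Figures copied from the dataset statistics above.
NUM_IMAGES = 28_408
NUM_QUESTIONS = 45_336
NUM_ANSWERS = 453_360


def per_unit(total: int, units: int) -> float:
    """Return the average number of `total` items per single unit."""
    return total / units


# Hypothetical summary values derived only from the three counts above.
answers_per_question = per_unit(NUM_ANSWERS, NUM_QUESTIONS)
questions_per_image = per_unit(NUM_QUESTIONS, NUM_IMAGES)

print(f"answers per question: {answers_per_question:.2f}")
print(f"questions per image:  {questions_per_image:.2f}")
```

The counts work out to exactly 10 ground truth answers per question, and to roughly 1.6 questions per image on average.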

Dataset Attributes

Tasks: Text Detection, Event Detection
Categories: Optical Character Recognition, Text, Comprehension
Sensor: RGB Camera