VQA

by Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, Devi Parikh

VQA is a new dataset containing open-ended questions about images. These questions require an understanding of vision, language and commonsense knowledge to answer.

- 265,016 images (COCO and abstract scenes)
- At least 3 questions (5.4 questions on average) per image
- 10 ground truth answers per question
- 3 plausible (but likely incorrect) answers per question
- Automatic evaluation metric
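This page does not spell out the automatic evaluation metric. As a rough illustration, the sketch below assumes the commonly cited VQA accuracy rule, in which a predicted answer scores min(number of annotators who gave that answer / 3, 1) against the 10 ground truth answers. The function name `vqa_accuracy` and the lowercase/strip normalization are illustrative assumptions; the official evaluation tools apply more extensive answer normalization and average over annotator subsets.

```python
from collections import Counter
from typing import Sequence


def vqa_accuracy(predicted: str, human_answers: Sequence[str]) -> float:
    """Score one predicted answer against the human answers for a question.

    Assumed rule (not stated on this page): an answer counts as fully
    correct when at least 3 annotators gave it, and earns partial
    credit of matches / 3 otherwise.
    """
    # Simplified normalization (assumption): case-insensitive, trimmed.
    counts = Counter(answer.strip().lower() for answer in human_answers)
    matches = counts[predicted.strip().lower()]
    return min(matches / 3.0, 1.0)


if __name__ == "__main__":
    # Hypothetical annotations for a single question.
    answers = ["yes"] * 2 + ["no"] * 8
    print(vqa_accuracy("no", answers))
    print(vqa_accuracy("yes", answers))
```

Under this rule, an answer given by only two of the ten annotators still earns partial credit, which softens the penalty for questions where people legitimately disagree.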

Dataset Attributes

Tasks: Visual Reasoning
Categories: Questions, Answers, People, Objects
Sensor: RGB Camera