CapGaze

by Sen He
Unknown

The dataset has two parts:

capgaze1: contains 1000 images with raw data (eye fixations and verbal descriptions) from 5 native English speakers. This part was used for the analysis. For data privacy reasons, the voices in the verbal descriptions were masked by pitch modulation; the spoken content was preserved.

capgaze2: contains 3000 images with processed data: for each image, the eye fixations from all viewers were combined into a single fixation map. This part was used to develop a saliency prediction model under the image captioning task.
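As a rough illustration of how per-viewer eye fixations can be pooled into one fixation map per image, here is a minimal sketch. It is not the dataset's actual processing code: the `(x, y)` pixel-coordinate format, the function name `combine_fixations`, and the choice of a simple count map (no smoothing) are all assumptions made for this example.

```python
import math
from typing import Iterable, List, Tuple

# Assumed format: one fixation is an (x, y) position in pixel coordinates.
Fixation = Tuple[float, float]


def combine_fixations(
    per_viewer: Iterable[Iterable[Fixation]], width: int, height: int
) -> List[List[int]]:
    """Pool fixations from all viewers into one height x width count map.

    Hypothetical helper: each cell holds the number of fixations that
    landed on that pixel, summed over every viewer of the image.
    """
    fixation_map = [[0] * width for _ in range(height)]
    for fixations in per_viewer:
        for x, y in fixations:
            col, row = math.floor(x), math.floor(y)
            # Ignore fixations that fall outside the image bounds.
            if 0 <= row < height and 0 <= col < width:
                fixation_map[row][col] += 1
    return fixation_map


# Two viewers looking at a 4 x 3 image; one fixation is off-image.
viewers = [[(1.2, 0.7), (3.0, 2.0)], [(1.9, 0.1), (10.0, 10.0)]]
fixation_map = combine_fixations(viewers, width=4, height=3)
```

Saliency models are often trained on a blurred version of such a map; whether and how capgaze2 applies smoothing is not stated here, so that step is left out.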

Dataset Attributes

Tasks: Visual Reasoning
Categories: Saliency, Attention, Gaze
Sensor: RGB Camera