The BreakingNews Dataset
by Arnau Ramisa
License: Research Only
To foster research on multi-modal news article analysis, we propose the BreakingNews dataset, which includes images, captions, geo-location information, and comments. The dataset contains approximately 100,000 news articles from several major newspapers and media agencies, collected between 1 January and 31 December 2014. All articles include at least one image and cover a wide variety of topics, including sports, politics, arts, healthcare, and local news. The copyright of all text and images resides with the original owners.
Dataset Attributes
Tasks: Image Captioning, Text Illustration, Geolocation, Source Detection, Popularity Prediction, Transfer Learning
Categories: News, Articles
Sensor: Web sampling