CINIC-10

by Luke N. Darlow, Elliot J. Crowley, Antreas Antoniou, Amos J. Storkey
Public Domain

CINIC-10 is a drop-in replacement for CIFAR-10. We compiled it as a benchmarking dataset because CIFAR-10 can be too small or too easy, while ImageNet is often too large or too difficult. ImageNet32 and ImageNet64 are smaller than ImageNet but even more difficult. CINIC-10 fills this benchmarking gap.

To combat the shortcomings of existing benchmarking datasets, we present CINIC-10: CINIC-10 Is Not ImageNet or CIFAR-10. It is an extension of CIFAR-10 via the addition of downsampled ImageNet images.

CINIC-10 has the following desirable properties:

- It has 270,000 images, 4.5 times as many as CIFAR-10.
- The images are the same size as in CIFAR-10, meaning that CINIC-10 can be used as a drop-in alternative to CIFAR-10.
- It has equally sized train, validation, and test splits. Some experimental setups may require more than one training dataset; even so, equal split sizes enable a fair assessment of generalisation performance.
- The train and validation subsets can be combined to make a larger training set.
- CINIC-10 consists of images from both CIFAR and ImageNet. These images are not necessarily identically distributed, presenting a new challenge: distribution shift. In other words, we can find out how well models trained on CIFAR images perform on ImageNet images for the same classes.
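The figures above can be checked with a little arithmetic, and merging the train and validation subsets amounts to pooling two sets of files. The sketch below is illustrative only. The CIFAR-10 total of 60,000 images, the `root/<split>/<class>/<image>.png` directory layout, the split folder names `train` and `valid`, and the helper names `split_sizes` and `combined_training_files` are all assumptions made for this example, not details stated in the description.

```python
from pathlib import Path

TOTAL_IMAGES = 270_000   # stated in the description above
NUM_SPLITS = 3           # train, validation, test (equally sized)
CIFAR10_TOTAL = 60_000   # assumption: CIFAR-10's standard 50,000 train + 10,000 test images


def split_sizes(total=TOTAL_IMAGES, num_splits=NUM_SPLITS):
    """Return the number of images per split, assuming equal split sizes."""
    if total % num_splits != 0:
        raise ValueError("total does not divide evenly into the requested splits")
    return total // num_splits


def combined_training_files(root, splits=("train", "valid"), pattern="*.png"):
    """Pool image paths from several split folders into one list.

    Assumes a hypothetical layout of root/<split>/<class>/<image>.png;
    adjust `splits` and `pattern` to match the files you actually have.
    """
    root = Path(root)
    files = []
    for split in splits:
        # One directory level per class, then the image files themselves.
        files.extend(sorted((root / split).glob(f"*/{pattern}")))
    return files


if __name__ == "__main__":
    per_split = split_sizes()
    print(per_split)                      # 90000 images in each split
    print(TOTAL_IMAGES / CIFAR10_TOTAL)   # 4.5 times the size of CIFAR-10
    print(2 * per_split)                  # 180000 images if train and validation are merged
```

Note that once train and validation are merged, no separate validation set remains for model selection, so this is mainly suited to a final training run before evaluating on the test split.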

Dataset Attributes