Diversity in Faces (DiF)

by Michele Merler, Nalini Ratha, Rogerio S. Feris, John R. Smith

Research Only

We are familiar with how faces differ by age, gender, and skin tone, and how different faces can vary across these dimensions. But, as prior studies have shown, these dimensions are not adequate for characterizing the full diversity of human faces. Dimensions such as facial symmetry, facial contrast, pose, and the length or width of the face’s features (eyes, nose, forehead, etc.) are also important.

For facial recognition systems to perform as desired, and for their outcomes to become increasingly accurate, training data must be diverse and offer a breadth of coverage. For example, training datasets must be large enough and varied enough that the technology learns the many ways in which faces differ and can accurately recognize those differences in a variety of situations. The images must reflect the distribution of features in faces we see in the world.

To help accelerate the study of diversity and coverage of data for AI facial recognition systems, IBM Research has released a large and diverse dataset called Diversity in Faces (DiF) to advance the study of fairness and accuracy in facial recognition technology. Our initial analysis has shown that the DiF dataset provides a more balanced distribution and broader coverage of facial images compared to previous datasets. Furthermore, the insights obtained from the statistical analysis of the 10 initial coding schemes on the DiF dataset have furthered our own understanding of what is important for characterizing human faces and enabled us to continue important research into ways to improve facial recognition technology.

The dataset is available today to the global research community upon request. IBM is proud to make this dataset available; our goal is to help further our collective research and contribute to creating AI systems that are fairer.

Dataset Attributes

Tasks: Facial Recognition
Categories: Faces