LABEL ENTROPY WITH SIMILARITY GRAPH CLIQUE FOR ASSESSING ANNOTATION QUALITY

Authors

  • Yasuhiro Iida
  • Bojan Mrazovac
  • Yasuo Ishigure

DOI:

https://doi.org/10.7251/ZJF2514259I

Keywords:

entropy, label propagation, graph-based machine learning, annotation

Abstract

We introduce simple metrics, based on the entropy of the label distribution within local maximum cliques of a similarity graph, for assessing the quality of human annotation in large image datasets. Because annotation is performed manually, a dataset can contain hidden errors caused by fluctuation or inconsistency in the annotators' judgments. Such errors act as label noise when a classifier is trained with machine learning, and they are especially critical in multi-class classification of medical images. This work focuses on assessing the overall label quality of a large dataset, and we propose novel metrics for that purpose. We also used these metrics to evaluate label noise detection methodologies and found that transformed label propagation with label smoothing, which is first proposed here, achieved high accuracy among the three methodologies compared.
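
The idea of a clique-level label entropy can be illustrated with a minimal sketch. It is not the authors' implementation: the thresholded similarity graph, the use of all maximal cliques (found with the Bron-Kerbosch algorithm), the averaging step, and every name below (similarity_graph, mean_clique_entropy, threshold, min_size) are assumptions made only for illustration. The intuition is that items similar enough to form a clique should share a label, so a clique with mixed labels has high entropy and hints at annotation noise.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the label distribution in a list."""
    counts = Counter(labels)
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def similarity_graph(sim, threshold):
    """Adjacency sets linking items whose similarity is >= threshold (assumed rule)."""
    n = len(sim)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield sorted(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def mean_clique_entropy(sim, labels, threshold=0.8, min_size=2):
    """Average label entropy over maximal cliques with at least min_size items."""
    adj = similarity_graph(sim, threshold)
    entropies = [
        label_entropy([labels[i] for i in clique])
        for clique in maximal_cliques(adj)
        if len(clique) >= min_size
    ]
    return sum(entropies) / len(entropies) if entropies else 0.0


# Toy data: items 0-2 are mutually similar, as are items 3-4.
sim = [
    [1.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 1.0],
]
clean = mean_clique_entropy(sim, ["a", "a", "a", "b", "b"])
noisy = mean_clique_entropy(sim, ["a", "a", "b", "b", "b"])
print(clean, noisy)
```

In this toy case the consistently labeled dataset scores zero, while relabeling item 2 raises the score because the clique {0, 1, 2} now mixes two labels. A real pipeline would derive the similarities from image features and would need a scalable clique search for large graphs.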

Published

2025-11-21