LABEL ENTROPY WITH SIMILARITY GRAPH CLIQUE FOR ASSESSING ANNOTATION QUALITY
DOI: https://doi.org/10.7251/ZJF2514259I
Keywords: entropy, label propagation, graph-based machine learning, annotation
Abstract
We introduce simple metrics, based on the entropy of the label distribution within local maximum cliques of a similarity graph, for assessing the quality of human annotation in large image datasets. Because annotation is performed manually, a dataset always contains hidden errors caused by annotator fluctuation or inconsistency. Such errors act as label noise when a classifier is trained with machine learning, and they are especially critical in multi-class classification of medical images. In this work, we focus on assessing label quality across an entire large dataset and propose novel metrics for that purpose. We also assessed label noise detection methodologies with our metrics and found that, among the three methodologies compared, transformed label propagation with label smoothing, which is first proposed here, showed quite high accuracy.
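To make the idea of a clique-level label entropy concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that the similarity graph is obtained by thresholding cosine similarity between feature vectors, and it approximates the local maximum clique of a sample by a greedily grown maximal clique around that sample. The function names, the threshold, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_graph(features, threshold):
    """Adjacency sets; i and j are linked when cosine similarity >= threshold.

    Assumption: a simple thresholded graph stands in for the similarity graph.
    """
    adj = {i: set() for i in range(len(features))}
    for i, j in combinations(range(len(features)), 2):
        if cosine(features[i], features[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_clique(adj, seed):
    """Grow a maximal (not necessarily maximum) clique containing `seed`."""
    clique = {seed}
    candidates = set(adj[seed])
    while candidates:
        # Prefer the candidate linked to the most other candidates.
        v = max(sorted(candidates), key=lambda c: len(adj[c] & candidates))
        clique.add(v)
        candidates &= adj[v]  # keep only nodes adjacent to every clique member
    return clique


def label_entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())


def clique_label_entropies(features, labels, threshold):
    """Entropy of the label distribution in the clique around each sample."""
    adj = similarity_graph(features, threshold)
    return [
        label_entropy([labels[k] for k in greedy_clique(adj, i)])
        for i in range(len(features))
    ]


# Hypothetical toy data: two tight groups, with one disagreeing label in the first.
features = [(1.0, 0.0), (0.9, 0.1), (0.95, 0.05), (0.0, 1.0), (0.1, 0.9)]
labels = ["a", "a", "b", "c", "c"]
entropies = clique_label_entropies(features, labels, threshold=0.9)
```

In this toy example the first three samples form one clique with labels a, a, b and therefore a nonzero entropy, while the last two samples share one label and have zero entropy. Under the stated assumptions, a high clique entropy marks a neighborhood of similar samples whose annotations disagree.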