TY - JOUR
T1 - On Clinical Agreement on the Visibility and Extent of Anatomical Layers in Digital Gonio Photographs
AU - Peroni, Andrea
AU - Paviotti, Anna
AU - Campigotto, Mauro
AU - Pinto, Luis A.
AU - Cutolo, Carlo A.
AU - Shi, Yue
AU - Cobb, Caroline
AU - Gong, Jacintha
AU - Patel, Sirjhun
AU - Gillan, Stewart
AU - Tatham, Andrew J.
AU - Trucco, Emanuele
N1 - This work is fully funded by NIDEK Technologies Srl., Albignasego, Italy.
PY - 2021/9/1
Y1 - 2021/9/1
N2 - Purpose: To quantitatively evaluate the inter-annotator variability of clinicians tracing the contours of anatomical layers of the iridocorneal angle on digital gonio photographs, thus providing a baseline for the validation of automated analysis algorithms. Methods: Using a software annotation tool on a common set of 20 images, five experienced ophthalmologists highlighted the contours of five anatomical layers of interest: iris root (IR), ciliary body band (CBB), scleral spur (SS), trabecular meshwork (TM), and cornea (C). Inter-annotator variability was assessed by (1) comparing the number of times ophthalmologists delineated each layer in the dataset; (2) quantifying how the consensus area for each layer (i.e., the intersection area of observers’ delineations) varied with the consensus threshold; and (3) calculating agreement among annotators using average per-layer precision, sensitivity, and Dice score. Results: The SS showed the largest difference in annotation frequency (31%) and the minimum overall agreement in terms of consensus size (∼28% of the labeled pixels). The average annotator’s per-layer statistics showed consistent patterns, with lower agreement on the CBB and SS (average Dice score ranges of 0.61–0.7 and 0.73–0.78, respectively) and better agreement on the IR, TM, and C (average Dice score ranges of 0.97–0.98, 0.84–0.9, and 0.93–0.96, respectively). Conclusions: There was considerable inter-annotator variation in identifying contours of some anatomical layers in digital gonio photographs. Our pilot indicates that agreement was best on IR, TM, and C but poorer for CBB and SS. Translational Relevance: This study provides a comprehensive description of inter-annotator agreement on digital gonio photograph segmentation as a baseline for validating deep learning models for automated gonioscopy.
KW - inter-annotator variability
KW - automated gonioscopy
KW - AI software validation
UR - http://www.scopus.com/inward/record.url?scp=85114726771&partnerID=8YFLogxK
U2 - 10.1167/tvst.10.11.1
DO - 10.1167/tvst.10.11.1
M3 - Article
C2 - 34468695
SN - 2164-2591
VL - 10
JO - Translational Vision Science and Technology
JF - Translational Vision Science and Technology
IS - 11
M1 - 1
ER -