Abstract
Facial landmark annotation in orthodontics and orthognathic surgery is crucial for accurate diagnosis, treatment planning, and outcome evaluation. Manual landmarking has long been the gold standard, but it is a time-consuming, labour-intensive, and subjective process prone to inaccuracies and inter-rater variation. Automatic landmarking tools can increase the efficiency and accuracy of landmark detection, but challenges in clinical settings have limited their adoption. This study aims to develop and evaluate an accurate automatic landmark detection model using convolutional neural networks (CNNs) for clinical 3D facial images.

The thesis comprises a systematic review and an experimental section. The systematic review provides a comprehensive analysis of the existing literature on automatic 3D facial landmark detection, identifying state-of-the-art techniques and critically evaluating the accuracy of the models and methods currently used in clinical and biological settings. The review reveals promising results achieved with deep learning algorithms. However, it also identifies notable methodological deficiencies in the majority of the included studies, along with concerns related to risk of bias and applicability; these should be considered carefully before the results are accepted or implemented.
One significant methodological weakness in the included studies was the limited and non-representative nature of their datasets. Other concerns included inadequate study design, particularly regarding the implementation of reference standards. These findings highlight the need to confirm the generalisability and reliability of these models. This can be achieved by using images that adequately represent clinical populations and by conducting well-structured studies that minimise bias and report their methods comprehensively, including the reference standards used. In conclusion, the systematic review shows that, while automated techniques have advanced, manual landmarking remains the gold standard for accuracy.
The experimental part addresses the limitations of the existing models by collecting high-quality clinical data, implementing quality control measures, and employing a state-of-the-art CNN model for automatic localisation of 37 facial landmarks on 3D facial images of patients attending the orthognathic clinic at the University of Glasgow. This phase of the study was conducted in three stages.
The first stage was dataset construction: collecting and annotating a high-quality ground-truth dataset of facial landmarks, which is crucial for developing CNN landmarking networks. A total of 408 consecutive 3D facial images from adult patients attending the orthognathic clinic at the University of Glasgow were selected, and an experienced operator annotated 37 landmarks on each. The reproducibility of intra-operator manual landmarking was calculated for each landmark along the x-, y-, and z-axes and as the Euclidean distance between repeated annotations.
The intra-operator reproducibility analysis demonstrated that, for the majority of the 37 landmarks, the coordinates along the x-, y-, and z-axes exhibited a mean absolute error of less than 1 mm, which is considered clinically acceptable. These findings are valuable for assessing the reliability and accuracy of the examiner's technique, ensuring the validity of the study's conclusions.
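As an illustration of this reproducibility analysis, the sketch below computes the per-axis mean absolute error and the mean Euclidean distance between two annotation sessions by the same operator. The array shapes, file names, and NumPy workflow are assumptions for illustration, not the thesis's actual code.

```python
import numpy as np

# Hypothetical inputs: two annotation sessions by the same operator,
# each of shape (n_images, n_landmarks, 3) holding x, y, z in mm.
session_a = np.load("annotations_session_a.npy")  # assumed file names
session_b = np.load("annotations_session_b.npy")

diff = session_a - session_b

# Per-landmark mean absolute error along each axis (mm).
mae_xyz = np.mean(np.abs(diff), axis=0)      # shape (n_landmarks, 3)

# Per-landmark mean Euclidean distance between the two sessions (mm).
euclidean = np.linalg.norm(diff, axis=2)     # shape (n_images, n_landmarks)
mean_euclidean = euclidean.mean(axis=0)      # shape (n_landmarks,)

for i, (mae, d) in enumerate(zip(mae_xyz, mean_euclidean)):
    print(f"landmark {i:2d}: MAE x/y/z = {mae[0]:.2f}/{mae[1]:.2f}/{mae[2]:.2f} mm, "
          f"Euclidean = {d:.2f} mm")
```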
The second stage involved constructing and training an automated landmarking model. To detect landmarks on 3D facial models, we employed a patch-based convolutional neural network (CNN). Rather than using the entire facial image, specific patches around individual landmarks were extracted and used as CNN input, enabling the network to concentrate on salient features and reducing computational complexity. We used 2.5D patches combining texture and depth data: this compact representation preserves the key features of the 3D facial structure while eliminating irrelevant image detail that could distract the network, improving landmark detection accuracy compared with traditional 2D methods.
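The abstract does not give implementation details, but a minimal sketch of how such a 2.5D patch might be assembled is shown below, assuming the 3D surface has already been rendered into registered texture and depth maps. The function name, padding strategy, and projection step are hypothetical; only the 151-pixel patch size comes from the thesis.

```python
import numpy as np

def extract_25d_patch(texture, depth, center_xy, size=151):
    """Crop a 2.5D patch (texture + depth channels) around a landmark.

    texture:   (H, W) or (H, W, 3) array rendered from the 3D surface.
    depth:     (H, W) array of per-pixel depth values registered to `texture`.
    center_xy: (col, row) projection of the landmark into the image plane.
    Illustrative assumption, not the author's implementation.
    """
    half = size // 2
    c, r = int(center_xy[0]), int(center_xy[1])

    # Pad with edge values so patches near the border keep a fixed size.
    pad = [(half, half), (half, half)] + [(0, 0)] * (texture.ndim - 2)
    tex = np.pad(texture, pad, mode="edge")
    dep = np.pad(depth, [(half, half), (half, half)], mode="edge")

    # After padding, original pixel (r, c) sits at (r + half, c + half),
    # so the slice below is centred on the landmark.
    tex_patch = tex[r:r + size, c:c + size]
    dep_patch = dep[r:r + size, c:c + size]
    if tex_patch.ndim == 2:
        tex_patch = tex_patch[..., None]

    # Stack texture and depth into one multi-channel 2.5D input.
    return np.concatenate([tex_patch, dep_patch[..., None]], axis=-1)
```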
To ensure the accuracy of the CNN model in detecting facial landmarks, access to an extensive training dataset was crucial. The dataset was therefore enlarged by generating smaller patches cropped from the extracted patches, yielding a total of 10,200 PNG images of 151 × 151 pixels for each landmark, derived from the original 408 patches. Of these images, 80% formed the training set, 10% the validation set, and 10% the test set. The training and validation sets were used during model development, while the test set was reserved for the final evaluation of the model's performance.
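A minimal sketch of the 80/10/10 split is given below; the function, seed, and shuffling strategy are assumptions rather than the author's exact procedure.

```python
import random

def split_dataset(paths, seed=42):
    """Split patch image paths into 80% train / 10% validation / 10% test,
    mirroring the proportions described in the thesis. The seed and
    shuffling strategy are illustrative assumptions."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)
    n = len(paths)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    train = paths[:n_train]
    val = paths[n_train:n_train + n_val]
    test = paths[n_train + n_val:]
    return train, val, test
```

As a design note, when augmented crops originate from the same source image, splitting at the patient or image level before cropping prevents near-duplicate views of the same face from leaking across the train and test sets.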
The final stage of the study involved evaluating the developed model. The proposed CNN-based approach surpassed existing automated models, achieving an overall localisation error of 0.83 mm ± 0.49 mm and success detection rates of 72%, 88%, and 94% for errors within 1 mm, 1.5 mm, and 2 mm, respectively. Our automatic landmarking method exhibited superior study design and performance compared with previously published methods: we employed a substantial training and testing dataset and incorporated a greater number of landmarks, enabling a more comprehensive assessment of the method's performance. These results highlight the potential of automatic landmark detection to deliver efficient and accurate outcomes in clinical settings, potentially enhancing the quality of care and treatment outcomes when incorporated into initial planning and post-treatment evaluation in orthodontics and orthognathic surgery. Further research is necessary to refine automated landmarking models and establish them as a viable alternative to manual landmarking.
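The reported figures correspond to standard localisation-error statistics. The sketch below shows how such metrics are typically computed; the function and variable names are assumed for illustration.

```python
import numpy as np

def evaluation_metrics(pred, truth, thresholds=(1.0, 1.5, 2.0)):
    """Overall localisation error and success detection rates (SDR).

    pred, truth: (n_samples, 3) predicted and ground-truth coordinates in mm.
    A sketch of the standard metrics reported in the thesis; names and
    shapes are assumptions."""
    errors = np.linalg.norm(pred - truth, axis=1)   # per-sample Euclidean error
    stats = {"mean_mm": errors.mean(), "std_mm": errors.std()}
    for t in thresholds:
        # Fraction of predictions within t mm of the ground truth, as a %.
        stats[f"sdr_{t}mm"] = (errors <= t).mean() * 100.0
    return stats
```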
| Date of Award | 2023 |
| --- | --- |
| Original language | English |
| Sponsors | Qatar Foundation |
| Supervisor | Peter Mossey (Supervisor) |