Abstract
A mature market for skin artificial intelligence (AI) will require a standardized data pipeline to minimize bias, and the ability to measure AI performance in a competition. We report on the first phase of a UK Research and Innovation (UKRI)-sponsored competition using AI for the detection of skin cancer. The primary aim was to evaluate the effectiveness of AI, using standardized but real-world clinical data, in detecting suspicious skin lesions referred from primary care for triage. Standardized macroscopic and dermoscopic images of skin lesions, along with pre-agreed, uniformly collected clinical metadata, were stored with Caldicott approval in the internationally recognized Digital Imaging and Communications in Medicine (DICOM) format. These data were derived from an established National Health Service triage system, with double labelling by consultant dermatologists and pathology confirmation when indicated. The AI classification models developed are intended to integrate seamlessly into existing teledermatology workflows, with the aims of optimizing those workflows, facilitating timely medical intervention and being highly inclusive. The first phase of the competition, funded by the UKRI and using small volumes of data, demonstrated the use of DICOM-formatted real-world data to develop, train and evaluate novel algorithms. Phase 2 of the competition, now underway, will develop and test prototypes with 5000–10 000 cases.
Having succeeded in the competition selection process, we developed image classifiers pretrained on open-source dermatology datasets comprising macroscopic and dermoscopic images. These classifiers were subsequently fine-tuned on competition data. To make full use of the DICOM data, a late fusion approach was implemented in which logistic regression combined the image-based predictions with the standardized metadata, so that each input feature contributed to the overall prediction. Additionally, a novel midlevel fusion methodology was proposed to combine image modalities and metadata effectively. Receiver operating characteristic curves were used to identify optimal operating points aligned with predefined sensitivity levels. Models using late fusion to integrate metadata with fine-tuned image classifier predictions achieved an estimated sensitivity of 93% at a specificity of 70%. Model coefficients indicated that the most important inputs were the dermoscopic and macroscopic images (which contributed approximately equally) and patient age. A classifier trained solely on dermoscopic images achieved 97% sensitivity and 52% specificity, with approximately half of benign lesions effectively triaged. This competition demonstrated the use of real-world standardized DICOM data to prototype skin AI. However, this phase also underscored the importance of large, well-labelled and diverse datasets to enhance generalizability across populations, and the importance of using transparent competitions to compare AI products.
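The late fusion and operating-point steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the three features (a dermoscopic score, a macroscopic score and age) are an assumed subset of the inputs, and the helper names `fuse`, `fit_logistic` and `operating_point` are hypothetical. It is written in plain Python to stay dependency-free; in practice a library such as scikit-learn would more likely be used.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fuse(x, w, b):
    """Late-fusion score: logistic regression over image scores and metadata."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = fuse(xi, w, b) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def operating_point(scores, labels, target_sensitivity):
    """Pick a threshold with sensitivity >= target; report specificity there."""
    pos = sorted((s for s, l in zip(scores, labels) if l == 1), reverse=True)
    neg = [s for s, l in zip(scores, labels) if l == 0]
    k = math.ceil(target_sensitivity * len(pos))  # positives that must be flagged
    thr = pos[k - 1]                               # flag a case when score >= thr
    sens = sum(s >= thr for s in pos) / len(pos)
    spec = sum(s < thr for s in neg) / len(neg)
    return thr, sens, spec

# Synthetic stand-in data (assumption): NOT competition data.
rng = random.Random(0)

def synthetic_case():
    y = 1 if rng.random() < 0.3 else 0
    a, c = (5, 2) if y else (2, 5)
    p_derm = rng.betavariate(a, c)    # stand-in for the dermoscopic classifier output
    p_macro = rng.betavariate(a, c)   # stand-in for the macroscopic classifier output
    age = rng.gauss(65, 12) if y else rng.gauss(45, 15)
    return [p_derm, p_macro, age / 100.0], y

cases = [synthetic_case() for _ in range(400)]
train, val = cases[:300], cases[300:]

# Fit the fusion model on the training split.
weights, bias = fit_logistic([x for x, _ in train], [y for _, y in train])

# Choose the operating point on the held-out split for a 95% sensitivity target.
val_scores = [fuse(x, weights, bias) for x, _ in val]
thr, sens, spec = operating_point(val_scores, [y for _, y in val], 0.95)
print(f"threshold={thr:.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Fixing the sensitivity target first and then reading off the specificity mirrors the triage setting, where missing a malignant lesion is the costlier error. The fitted `weights` play the role of the model coefficients mentioned in the abstract, although their values here reflect only the synthetic data.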
| Original language | English |
|---|---|
| Article number | ljaf085.029 |
| Pages (from-to) | i16-i17 |
| Journal | British Journal of Dermatology |
| Volume | 193 |
| Issue number | Supplement_1 |
| Publication status | Published - 27 Jun 2025 |
| Event | 105th Annual Meeting of the British Association of Dermatologists, Glasgow SEC, Glasgow, United Kingdom; Duration: 1 Jul 2025 → 3 Jul 2025; Conference number: 105; https://badannualmeeting.co.uk/ |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Title: P001 Advancing skin cancer detection with artificial intelligence: insights from a UK Research and Innovation-sponsored competition using real-world clinical data