© 2024 MJH Life Sciences™ and OncLive - Clinical Oncology News, Cancer Expert Insights. All rights reserved.
An artificial intelligence-based smartphone app used by primary care physicians showed high diagnostic accuracy in detecting cutaneous melanoma.
Dermalyser, an artificial intelligence (AI)-based clinical decision support tool, displayed high diagnostic accuracy in detecting cutaneous melanoma in a primary care setting, according to findings from a prospective clinical study (NCT05172232) published in the British Journal of Dermatology.1
The tool, a smartphone app that uses AI to analyze dermoscopic images taken by a primary care physician, showed an area under the receiver operating characteristic curve (AUROC) of 0.960 (95% CI, 0.928-0.980), corresponding to a maximum sensitivity of 95.2% and a maximum specificity of 84.5% in identifying melanomas. For invasive melanoma only, the AUROC was 0.988 (95% CI, 0.965-0.997), with a maximum sensitivity of 100% and a maximum specificity of 92.6%. Of 253 lesions analyzed, 21 proved to be melanoma, including 11 thin invasive melanomas and 10 melanomas in situ.
“The AI-based decision support tool to detect melanoma evaluated in this study appears to be clinically reliable and of potential clinical benefit in the management of skin lesions of concern assessed in primary care and can improve the identification of lesions in need of dermatological or histopathological assessment,” Panos Papachristou, MD, PhD, of the Department of Neurobiology, Care Sciences and Society, at Karolinska Institutet in Stockholm, Sweden, and coauthors wrote.
The first-in-clinic, noninterventional clinical study evaluated the safety, performance, and benefit of Dermalyser in patients with cutaneous lesions where malignant melanoma could not be ruled out. The study enrolled patients from 36 primary care centers across Sweden; to be eligible, patients had to be at least 18 years old, able to provide written consent, and attending a primary care facility with at least 1 suspicious skin lesion where malignant melanoma could not be ruled out.1,2
Primary care physicians recorded their degree of suspicion for melanoma as being either “high” or “low, but cannot rule out melanoma.” Then, they determined a course of action for the patient with either excision at the primary care center; referral for excision by another surgeon; referral to a dermatologist for further clinical evaluation (with or without the use of teledermoscopy); or another action that needed to be specified.1
After making their own determinations, primary care providers uploaded images taken with a smartphone connected to a dermoscope to the Dermalyser app. The app’s decision support system was trained in silico prior to the start of the study on a large set of dermoscopic images of skin lesions. When applied to the dermoscopic images uploaded by the clinician, the app’s algorithm produces a value between 0 and 1 corresponding to the calculated probability that the lesion is melanoma; based on that probability, the determination is presented to the clinician as either “evidence of melanoma detected” or “no evidence of melanoma detected.” Investigators set a predefined cutoff level to distinguish between the 2 readings, which corresponded to a sensitivity of 95% and a specificity of 78%. For each lesion, the app’s determination was later compared with the final clinical or histopathological tumor diagnosis from the patient’s record.
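The app’s binary readout described above amounts to thresholding a probability score. A minimal sketch in Python, assuming a placeholder cutoff (the study reports only the operating characteristics at its predefined cutoff, 95% sensitivity and 78% specificity, not the cutoff value itself):

```python
# Sketch of the app's readout logic: the algorithm's 0-1 probability score
# is compared against a fixed cutoff to produce the message shown to the
# clinician. CUTOFF below is a hypothetical value for illustration; the
# actual study cutoff is not reported in this article.

CUTOFF = 0.5  # placeholder; not the study's actual cutoff


def readout(melanoma_probability: float) -> str:
    """Map the algorithm's calculated probability to the clinician-facing message."""
    if melanoma_probability >= CUTOFF:
        return "evidence of melanoma detected"
    return "no evidence of melanoma detected"
```

Lowering the cutoff raises sensitivity at the expense of specificity, which is why the study fixed it in advance at a sensitivity-weighted operating point.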
The primary objective of the study was to determine the diagnostic precision of the device and the secondary objective was to evaluate the usability of the tool.2
At baseline, 253 lesions were collected from 228 patients; the median age of the overall study population was 56 years (IQR, 55-69). The study included 140 lesions from female patients and 112 from males. Lesions were most commonly Fitzpatrick skin type II (n = 165) followed by skin type III (n = 65), type I (n = 20), and type IV (n = 13). The median lesion size was 6 mm (IQR, 4-10); lesions were collected from the posterior torso (n = 82), anterior torso (n = 54), lower extremities (n = 38), upper extremities (n = 24), head and neck (n = 21), face (n = 16), lateral torso (n = 11), palms/soles (n = 5), and groin/genital region (n = 2).1
Additional findings from the study showed that Dermalyser displayed a positive predictive value of 35.9% and a negative predictive value of 99.5%; at the predefined cutoff level, the positive and negative predictive values were 17.9% and 99.3%, respectively. The primary care physician’s degree of melanoma suspicion alone had positive and negative predictive values of 23.5% and 95.5%, respectively. When combined, the app’s outcome and the physician’s decision-making demonstrated a positive predictive value of 31.6% and a negative predictive value of 99.2%. Both the app’s indication that evidence of melanoma was present (OR, 26.55; 95% CI, 3.29-213.96; P = .002) and a high degree of melanoma suspicion from the primary care physician (OR, 3.35; 95% CI, 1.19-9.44; P = .02) were significantly associated with a final diagnosis of melanoma.
Of the 38 lesions with a high degree of physician suspicion for which the app detected evidence of melanoma, 12 had a final diagnosis of melanoma and 26 were nonmelanoma. None of the 13 lesions with a high degree of physician suspicion but no evidence of melanoma per the app were determined to be melanoma upon final diagnosis. Further, among the 74 lesions with a low degree of physician suspicion that the app classified as having evidence of melanoma, 8 were determined to be melanoma upon final diagnosis. All but 1 of the 128 lesions deemed to be of low suspicion for which the app detected no evidence of melanoma received a nonmelanoma final diagnosis.
“Despite a large number of studies on the ability of AI algorithms to recognize skin cancer by dermoscopic images, surprisingly few of these have a prospective study design. Most algorithms have been tested on varying numbers of already sampled images, often comparing AI performance with that of a group of clinicians,” study authors wrote in conclusion. “Further research, preferably with a randomized study design, is warranted to determine the tool’s actual usefulness and diagnostic safety over time.”