Abstract
Background: Ear diseases such as cerumen (wax) impaction, tympanic membrane (TM) perforation, and infection are common and can cause hearing loss, particularly in low-resource settings. Conventional otoscopic diagnosis is difficult for non-specialists, prompting research into smartphone-based deep learning (DL) otoscopy to improve diagnostic accuracy and accessibility. This study assessed a smartphone-integrated DL system for classifying ear findings (wax impaction, TM perforation, infection, and normal TM) in an outpatient setting at Bolan Medical College, Quetta.

Methods: We conducted a 6-month prospective pilot study of 80 patients presenting with ear complaints. A smartphone-attached digital otoscope captured otoscopic images, which were analyzed by a DL pipeline combining YOLOv5 object detection with EfficientNet classification. The model was trained by transfer learning on an augmented dataset of 320 images (80 per category). ENT specialists established ground-truth diagnoses. We calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for each category and compared them with previously published results.

Results: The median age was 29 years (range 5-64); 56% of patients were female and 65% came from rural areas. Infection was the most common diagnosis (35%), followed by wax impaction (27.5%), TM perforation (22.5%), and normal TM (15%). The DL model achieved an overall accuracy of 91.3%, correctly classifying 73 of 80 cases. Sensitivity was 91.7% for normal TM, 95.5% for wax impaction, 88.9% for TM perforation, and 89.3% for infection, with specificity ranging from 94.2% to 100% (Table 1). PPV and NPV were high across all categories (Table 1). Figure 2 shows the model's sensitivity and specificity by category. Diagnostic performance for wax impaction and TM perforation was particularly strong, with no false positives (PPV = 100%).
The model performed slightly worse in detecting infections, with a few otitis cases misclassified as normal TM or perforation.

Conclusion: This pilot study demonstrates the feasibility of smartphone-based DL otoscopy in a low-resource clinical setting. The model achieved diagnostic accuracy comparable to expert evaluation for the major ear conditions studied. AI-assisted otoscopy could increase access to early ear disease detection in rural and underserved areas. Larger multicenter trials are needed to validate and refine the model, incorporate tympanic membrane segmentation, and address real-world issues such as variable image quality and diverse pathologies. Our findings support the promise of smartphone DL otoscopy as an affordable tool for improving equity and outcomes in ear care.
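The per-category sensitivity, specificity, PPV, and NPV reported above follow the standard one-vs-rest definitions applied to a multiclass confusion matrix. A minimal sketch of that computation (the example counts are hypothetical placeholders for illustration, not the study's actual data) is:

```python
# Per-class one-vs-rest diagnostic metrics from a multiclass confusion matrix.
# confusion[i][j] = number of cases with true class i predicted as class j.

def diagnostic_metrics(confusion, labels):
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]                                   # true positives
        fn = sum(confusion[i]) - tp                            # missed cases of this class
        fp = sum(confusion[j][i] for j in range(len(labels))) - tp  # other classes predicted as this one
        tn = total - tp - fn - fp                              # everything else
        metrics[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
        }
    return metrics

labels = ["normal", "wax", "perforation", "infection"]
confusion = [  # hypothetical counts, for illustration only
    [11, 0, 0, 1],
    [0, 21, 0, 1],
    [1, 0, 16, 1],
    [2, 0, 1, 25],
]
for label, m in diagnostic_metrics(confusion, labels).items():
    print(label, {k: round(v, 3) for k, v in m.items()})
```

In a one-vs-rest evaluation each class is scored against the pooled remaining classes, which is why a 100% PPV for a category corresponds to zero false positives for that category.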