Skin cancer detection using deep learning
Deep learning architectures, particularly convolutional neural networks (CNNs), have demonstrated the ability to classify skin lesions with accuracy comparable to or exceeding board-certified dermatologists across various diagnostic tasks. These systems utilize transfer learning, advanced feature extraction, and optimization algorithms to identify malignancies such as melanoma, basal cell carcinoma, and squamous cell carcinoma.
Core Architectures and Training Methodologies
- Transfer Learning and Inception Models: A CNN based on the Inception v3 architecture, pretrained on 1.28 million images from ImageNet, was fine-tuned on a dataset of 129,450 clinical images organized by a novel taxonomy of 2,032 diseases (Direct, High; PMID: 28117445).
- Custom ResNet Architectures: A custom-designed ResNet-65 model, comprising 65 layers and 6 residual blocks, achieved a maximum accuracy of 99.7% using a Wide Neural Network classifier on ISIC dataset images (Direct, High; DOI: 10.65278/ijtaci.2025.31).
- Explainable Deep CNNs (ECNN): A Grid-Based Structural and Dimensional (GBSD) ECNN leverages self-feature selection and the Adaptive Intelligent Coney Optimization (AICO) algorithm to improve convergence speed and diagnostic accuracy, reaching 96% accuracy on the ISIC dataset (Direct, High; PMID: 38338828).
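The residual blocks mentioned for ResNet-style models can be illustrated with a toy forward pass. This is a minimal sketch, not the ResNet-65 design from the cited study: it uses small fully connected layers in place of convolutions, and the names `linear`, `residual_block`, and the example weights are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights: List[Vector], bias: Vector) -> Callable[[Vector], Vector]:
    """Return a function computing W @ x + b for a fixed W and b."""
    def apply(x: Vector) -> Vector:
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return apply


def residual_block(x: Vector, f1, f2) -> Vector:
    """Compute relu(x + f2(relu(f1(x)))); the skip connection adds the input back."""
    branch = f2(relu(f1(x)))
    return relu([xi + yi for xi, yi in zip(x, branch)])


# Hypothetical 2-unit layers: with all-zero weights the block reduces to relu(x).
zero = linear([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
passthrough = residual_block([1.5, 2.0], zero, zero)

# With identity weights the branch output is added on top of the input.
ident = linear([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
boosted = residual_block([1.0, -3.0], ident, ident)
```

Because the input is added back to the branch output, a block whose learned branch contributes nothing still passes its input through, which is the usual argument for why deep residual stacks remain trainable.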
Diagnostic Accuracy and Clinical Performance
- Dermatologist Comparison: In a study involving 21 board-certified dermatologists, a CNN matched human performance in melanoma classification and keratinocyte carcinoma detection, achieving an Area Under the Curve (AUC) of over 91% for these tasks (Direct, High; PMID: 28117445).
- Mohs Surgery Integration: A deep learning algorithm for basal cell carcinoma (BCC) detection demonstrated an AUC of 0.97 in identifying positive tumor margins, facilitating real-time tumor mapping during Mohs micrographic surgery (Direct, High; PMID: 38651039).
- Squamous Cell Carcinoma Detection: For cutaneous squamous cell carcinoma (cSCC), deep learning models identified tumor presence at 50-micron resolution; however, histomorphology alone was found insufficient for well-differentiated tumors, requiring the inclusion of broader tissue architectural context for accurate delineation (Direct, High; PMID: 37293008).
- Multiclass Versatility: Beyond melanoma and carcinomas, deep learning models have been applied to classify common conditions such as acne, psoriasis, eczema, and rosacea (Direct, High; PMID: 38809652).
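Several of the results above are reported as AUC values. As a reminder of what that number measures, the sketch below computes AUC as the fraction of (malignant, benign) pairs in which the malignant case receives the higher score. The labels and scores are made up for illustration and do not come from any cited study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for six lesions (1 = malignant, 0 = benign).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 pairs are ranked correctly
```

An AUC of 0.97 therefore means that about 97% of such pairs are ordered correctly, independent of any single decision threshold.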
Optimization and Preprocessing Techniques
- Noise Reduction: Preprocessing pipelines often include noise reduction for hair removal, utilizing Gaussian and median blur filters to accentuate infected regions and minimize errors caused by non-lesion artifacts (Direct, High; PMID: 38809652, PMID: 38338828).
- Feature Extraction Enhancements: Advanced systems incorporate grid-based structural patterns, such as Local Binary Patterns (LBP) and Local Directional Patterns (LDP), to maintain robustness against illumination changes and noisy image environments (Direct, High; PMID: 38338828).
- Optimization Algorithms: Hyperparameter tuning with nature-inspired algorithms such as Adaptive Intelligent Coney Optimization (AICO) has been shown to reduce computational time to as low as 0.55 seconds for classification tasks (Direct, High; PMID: 38338828).
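The robustness of Local Binary Patterns to illumination changes can be seen in a small example. This is a sketch of the basic 3x3 LBP operator only; the grid-based GBSD features of the cited work are not reproduced, and the bit ordering in `OFFSETS` is a convention chosen here.

```python
from typing import List

Image = List[List[int]]

# Clockwise neighbour offsets starting at the top-left pixel (a convention chosen here).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """8-bit LBP code of the interior pixel (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_image(img: Image) -> Image:
    """LBP codes for all interior pixels (the one-pixel border is skipped)."""
    return [[lbp_code(img, r, c) for c in range(1, len(img[0]) - 1)]
            for r in range(1, len(img) - 1)]


patch = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
# Simulate a uniform brightness shift.
brighter = [[v + 25 for v in row] for row in patch]

codes = lbp_image(patch)
codes_brighter = lbp_image(brighter)
```

Since each bit depends only on whether a neighbour is at least as bright as the centre, adding a constant to every pixel leaves the codes unchanged.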
Clinical Workflow and Operational Efficiency
- Workflow Parallelization: Implementation of AI in Mohs surgery workflows is estimated to save approximately 35.7% of slide waiting time and nearly 18% of staff and histotechnician time per day by providing earlier diagnostic insights (Direct, High; PMID: 38651039).
- Decision Support: Saliency maps and Grad-CAM++ are utilized to provide visual explanations of the pixels influencing CNN predictions, increasing the transparency of the "black box" model for clinical practitioners (Direct, High; PMID: 38338828).
- Deployment: These diagnostic methods are designed to be scalable and deployable on mobile devices, potentially expanding access to primary care screenings and augmenting decision-making for specialists (Direct, High; PMID: 28117445, PMID: 38809652).
Unverified Citations
The following sources failed to support their assigned claims after 3 verification rounds designed to ensure only high-confidence, relevant references are retained:
- PMID:38809652: "1% on the ISIC 2020 challenge dataset"
Failed (conclusion): The claim asserts a specific accuracy of '1%' (likely a typo in the user prompt or claim) while the paper reports 84.4% top-1 and 97.1% top-5 accuracy.
- PMID:28117445: "Decision Support: Saliency maps and Grad-CAM++ are utilized to provide visual explanations of the pixels influen..."
Failed (entities, conclusion): The entity 'Grad-CAM++' is not mentioned in the provided text for Paper 2; the paper uses gradient-based saliency maps but not the specific Grad-CAM++ algorithm.
Based on the provided research, the performance differences between EfficientNet B0-B7 and custom ResNet architectures in multiclass skin cancer classification are characterized by trade-offs between model size, parameter efficiency, and raw diagnostic accuracy.
EfficientNet B0-B7 Performance Dynamics
- Scaling and Accuracy: EfficientNet models (B0 through B7) demonstrate a consistent improvement in classification and segmentation accuracy as model complexity increases (Direct, High; PMID: 38809652).
- Top-k Accuracy: On the ISBI-2016 dataset, the Top-1 accuracy ranges from 81.0% for the B0 model to 84.4% for the B7 model. The Top-5 accuracy for the B7 variant reaches 97.1% (Direct, High; PMID: 38809652).
- Parameter Efficiency: EfficientNet-B7 is noted for its efficiency: it is reported to be 8.4 times smaller than the leading existing CNNs while maintaining high accuracy, making it suitable for deployment on mobile devices (Direct, High; PMID: 38809652).
- Class-Specific Accuracy: While Benign Keratosis classification exceeded 87% accuracy across EfficientNet models, the melanoma class reached an average accuracy of 80.51%. Substantial performance variations were noted in predicting specific conditions like warts (90.7%) and psoriasis (84.2%) (Direct, High; PMID: 38809652).
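The Top-1 and Top-5 figures above follow the usual top-k definition: a prediction counts as correct when the true class is among the k highest-scoring classes. A minimal sketch, with made-up probabilities and labels rather than data from the cited study:

```python
from typing import Sequence


def top_k_accuracy(probs: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(probs, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Made-up softmax outputs for four images over three classes.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
]
labels = [0, 1, 0, 0]

top1 = top_k_accuracy(probs, labels, k=1)
top2 = top_k_accuracy(probs, labels, k=2)
```

Top-k accuracy can only stay level or rise as k grows, which is why Top-5 values are always at least as high as Top-1 values for the same model.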
Custom ResNet Architecture Performance
- ResNet-65 Capabilities: A custom ResNet-65 architecture, featuring 65 layers and 6 residual blocks with 151 million learnable parameters, achieved a maximum accuracy of 99.7% on the ISIC dataset (Direct, High; DOI: 10.65278/ijtaci.2025.31).
- Trade-off Metrics: While ResNet-65 provides superior accuracy, models like ResNet-50 and DenseNet are described as faster and more efficient, though they offer comparatively lower accuracy (Direct, High; DOI: 10.65278/ijtaci.2025.31).
Comparative Outcomes in Multiclass Classification
- Diagnostic Precision: Custom-designed residual networks (like ResNet-65) have reported significantly higher peak accuracies (99.7%) compared to the EfficientNet B0-B7 suite (max 84.4% Top-1), although these results stem from different study methodologies and specific dataset partitions (Derived, Medium; PMID: 38809652, DOI: 10.65278/ijtaci.2025.31).
- Computational Cost vs. Performance: Across the EfficientNet suite, the accuracy gap between performance tiers is reported to diminish as complexity increases, suggesting that higher-complexity variants such as B7 are a robust choice for broad segmentation scenarios, provided their computational cost is acceptable (Direct, High; PMID: 38809652).
Unverified Citations
The following sources failed to support their assigned claims after 3 verification rounds designed to ensure only high-confidence, relevant references are retained:
- PMID:38809652: "4 times smaller than the leading existing CNNs while maintaining high accuracy, making it suitable for deployment on mob..."
Failed (conclusion): The paper states the model is 8.4 times smaller, whereas the claim asserts it is 4 times smaller, a significant quantitative discrepancy.
- PMID:38338828: "7% on the ISIC dataset using a Wide Neural Network classifier"
Failed (entities, conclusion): The paper reports an accuracy of 96% for its proposed model, not the 7% mentioned in the claim, and does not use a 'Wide Neural Network' classifier.
Possible alternatives (unverified): PMID:28117445 (40% topic match)
- DOI:10.65278/ijtaci.2025.31: "7% on the ISIC dataset using a Wide Neural Network classifier"
Failed (conclusion): The paper reports 99.7% accuracy for the Wide Neural Network on the ISIC dataset, which contradicts the claim's value of 7%.
Possible alternatives (unverified): PMID:28117445 (40% topic match)
- DOI:10.65278/ijtaci.2025.31: "4% Top-1), although these results stem from different study methodologies and specific dataset partitions"
Failed (conclusion): The paper reports a maximum accuracy of 99.7% for its specific model/partition, which does not support the claim's '4% Top-1' figure.
- PMID:38338828: "86% by utilizing nature-inspired optimization for hyperparameter tuning"
Failed (conclusion): The paper reports an accuracy of 96% (0.96), not 86% as asserted in the claim.
According to the provided context, Grad-CAM (and its advanced variant, Grad-CAM++) is considered a useful tool for enhancing the transparency and clinical utility of deep learning models in dermatology, though standard versions have reported limitations in interpretability.
Usefulness of Grad-CAM and Grad-CAM++
- Visual Explainability: Grad-CAM++ is utilized to visualize decisions made by Convolutional Neural Networks (CNNs), providing class-discriminative localization that makes the models more explainable and transparent to clinicians (Direct, High; PMID: 38338828) «✓ PMID:38338828».
- Clinical Trust: By generating visual explanations, these techniques help foster trust and acceptance of AI-driven technologies in medical practice (Direct, High; PMID: 38338828) «✓ PMID:38338828».
- Implementation Ease: Grad-CAM++ is highly useful because it generates visual explanations for any CNN architecture without requiring modifications to the architecture or model re-training (Direct, High; PMID: 38338828) «✓ PMID:38338828».
- Overfitting Mitigation: In advanced frameworks, the incorporation of Grad-CAM++ helps solve overfitting issues by allowing for self-feature selection and visual verification of lesion localization (Direct, High; PMID: 38338828) «✓ PMID:38338828».
- Early Diagnosis: Previous research utilized Grad-CAM in optimized CNN models to address typical skin cancer problems and assist doctors in early diagnosis (Indirect, Medium; PMID: 38338828).
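To make the idea of class-discriminative localization concrete, the following sketch computes a basic Grad-CAM heatmap from feature maps and their gradients. It shows plain Grad-CAM, in which each channel is weighted by the spatial average of its gradient; Grad-CAM++ replaces that average with a pixel-wise weighting, which is not shown here. The activations and gradients are made-up values, and in practice they would come from a trained CNN.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Basic Grad-CAM: ReLU of the gradient-weighted sum of feature maps.

    activations[k] is feature map k of the chosen convolutional layer and
    gradients[k] is the gradient of the class score with respect to it.
    """
    rows, cols = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial average of that channel's gradient.
    weights = [sum(sum(r) for r in g) / (rows * cols) for g in gradients]
    cam = [[0.0] * cols for _ in range(rows)]
    for w, fmap in zip(weights, activations):
        for i in range(rows):
            for j in range(cols):
                cam[i][j] += w * fmap[i][j]
    # Keep only locations with a positive influence on the class score.
    return [[max(0.0, v) for v in row] for row in cam]


# Two made-up 2x2 feature maps and their gradients.
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

The computation only reads activations and gradients from an existing network, which is consistent with the point above that no architectural change or re-training is needed.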
Limitations and Refinements
- Low Interpretability Scores: Despite its utility, standard Grad-CAM models may suffer from low interpretability scores. To address this, some researchers suggest that additional components, such as transformers, are needed to improve the score (Indirect, Medium; PMID: 38338828).
- Technological Evolution: To overcome the limits of standard Grad-CAM, modern architectures now fuse Grad-CAM++ with other modules like Full Grad to produce more refined saliency maps and guided propagation images (Direct, High; PMID: 38338828) «✓ PMID:38338828».
Summary of Clinical Role
In the context of skin cancer classification, Grad-CAM and its derivatives are presented as essential for opening up the "black box" of deep learning. While standard Grad-CAM provides a baseline for lesion localization, the literature suggests that Grad-CAM++ is superior for clinical application because it provides higher-quality visual explanations without the need for model alterations (Derived, Medium; PMID: 38338828) «✓ PMID:38338828».
Deep learning models utilizing the EfficientNet-B3 architecture and Grad-CAM (Gradient-weighted Class Activation Mapping) provide a framework for achieving high diagnostic accuracy while maintaining interpretability in skin cancer detection. EfficientNet-B3 offers a balanced scaling of network depth and width for feature extraction, while Grad-CAM techniques visualize the specific lesion areas driving the model's predictions.
Performance of EfficientNet-B3 in Skin Lesion Analysis
- Segmentation Accuracy: In evaluations using the ISBI-2016 test dataset, the EfficientNet-B3 model achieved a Top-1 segmentation accuracy of 83.4% and a Top-5 accuracy of 93.2% (Direct, High; PMID: 38809652).
- Architecture Scaling: EfficientNet architectures like B3 are specifically selected for their ability to scale depth, width, and image resolution effectively, which optimizes the trade-off between computational efficiency and the ability to capture intricate dermatological features (Direct, High; PMID: 38809652).
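The joint scaling of depth, width, and resolution can be sketched with the compound-scaling rule from the original EfficientNet paper (Tan and Le, 2019), which uses base coefficients of 1.2, 1.1, and 1.15. The source summarized here does not state which compound coefficient corresponds to each B-variant, so `phi` below is a free parameter and the function names are hypothetical.

```python
# Base coefficients reported in the EfficientNet paper (Tan and Le, 2019).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi: float) -> dict:
    """Depth, width and resolution multipliers for compound coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


def flops_factor(phi: float) -> float:
    """Approximate FLOPs growth: depth * width^2 * resolution^2."""
    s = compound_scale(phi)
    return s["depth"] * s["width"] ** 2 * s["resolution"] ** 2


baseline = compound_scale(0)   # all multipliers equal 1.0
one_step = flops_factor(1)     # close to 2: each step roughly doubles the cost
```

The coefficients are chosen so that depth times width squared times resolution squared is approximately 2, which is why each unit increase in `phi` roughly doubles the compute budget.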
Interpretability and Saliency Mapping with Grad-CAM
- Clinical Utility: Grad-CAM has been utilized in optimized CNN models to assist clinicians in early skin cancer diagnosis by highlighting relevant pathological features (Indirect, Medium; PMID: 38338828).
- Interpretability Challenges: Standard Grad-CAM models sometimes report low interpretability scores, which can necessitate the integration of additional transformers or advanced variants like Grad-CAM++ to improve visual explanations (Indirect, Medium; PMID: 38338828).
- Grad-CAM++ Enhancements: Unlike some visualization techniques, Grad-CAM++ provides class-discriminative localization without requiring architectural changes or model re-training, fostering greater transparency and trust in automated diagnostic decisions (Direct, High; PMID: 38338828).
- Saliency Map Correlations: Saliency mapping (including Grad-CAM and its derivatives) helps verify that the network fixates on the lesions themselves, rather than on healthy skin or background noise, by computing the gradient of the loss with respect to the input pixels (Derived, Medium; PMID: 28117445, PMID: 38338828).
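A gradient-based saliency map assigns each pixel the magnitude of the derivative of a model output with respect to that pixel. Deep learning frameworks obtain this by backpropagation; the sketch below substitutes a central finite difference so that it runs without any framework. The function `toy_score` is a hypothetical stand-in for a network's class score.

```python
from typing import Callable, List

Image = List[List[float]]


def saliency_map(score: Callable[[Image], float], img: Image, eps: float = 1e-4) -> Image:
    """Absolute central-difference estimate of d(score)/d(pixel) for every pixel."""
    sal = [[0.0] * len(img[0]) for _ in img]
    for i in range(len(img)):
        for j in range(len(img[0])):
            original = img[i][j]
            img[i][j] = original + eps
            up = score(img)
            img[i][j] = original - eps
            down = score(img)
            img[i][j] = original  # restore the pixel before moving on
            sal[i][j] = abs(up - down) / (2 * eps)
    return sal


def toy_score(img: Image) -> float:
    """Hypothetical stand-in for a class score: depends only on the centre pixel."""
    return 3.0 * img[1][1]


image = [[0.1, 0.2, 0.1], [0.2, 0.9, 0.2], [0.1, 0.2, 0.1]]
sal = saliency_map(toy_score, image)
```

Here only the centre pixel receives a non-zero value, mirroring the case where a network's output depends on the lesion and not on the surrounding skin.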
Implementation in Clinical Workflows
- Preprocessing Requirements: To ensure Grad-CAM highlights the lesion accurately, images typically undergo preprocessing to remove artifacts like body hair, using techniques such as Gaussian blur and median blur to reduce noise (Direct, High; PMID: 38809652).
- Model Deployment: EfficientNet-based models, including B3, are recognized for being parameter-efficient, which facilitates their deployment on mobile devices for real-time clinical decision support (Direct, High; PMID: 38809652).
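The effect of a median blur on thin artifacts such as hair can be shown on a toy image. This is a pure-Python stand-in for the library filters a real pipeline would use, with a fixed 3x3 window and made-up pixel values; the function name is hypothetical.

```python
from statistics import median
from typing import List

Image = List[List[int]]


def median_blur_3x3(img: Image) -> Image:
    """3x3 median filter; border pixels are copied unchanged."""
    out = [row[:] for row in img]
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            window = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = int(median(window))
    return out


# Bright skin (200) crossed by a one-pixel-wide dark "hair" (20) in column 2.
image = [[200, 200, 20, 200, 200] for _ in range(5)]
smoothed = median_blur_3x3(image)
```

In every interior window the dark pixels are outnumbered six to three, so the median restores the skin value, while a mean filter would only dilute the dark line.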
Unverified Citations
The following sources failed to support their assigned claims after 3 verification rounds designed to ensure only high-confidence, relevant references are retained:
- PMID:38809652: "8% (Top-4) accuracy"
Failed (conclusion): The claim specifies '8% (Top-4) accuracy' but the paper reports Top-4 accuracy for model B3 as 89.8% (and no model in the table has a value of 8%).
- PMID:28117445: "Model Deployment: EfficientNet-based models, including B3, are recognized for being parameter-efficient, which f..."
Failed (entities): The paper (published in 2017) describes a CNN based on Inception v3 and does not mention EfficientNet or the B3 architecture, which were developed later.
On the ISBI-2016 dataset, the Top-1 accuracy of EfficientNet-B3 is slightly lower than that of the higher-complexity B7 model, consistent with the reported trend of accuracy improving as model complexity increases.
Accuracy Comparison on the ISBI-2016 Dataset
- Top-1 Accuracy: EfficientNet-B3 achieved a Top-1 accuracy of 83.4%, whereas EfficientNet-B7 demonstrated a higher accuracy of 84.4% (Direct, High; PMID: 38809652).
- Top-5 Accuracy: The performance gap is more pronounced in Top-5 metrics, where EfficientNet-B3 reached 93.2% compared to 97.1% for EfficientNet-B7 (Direct, High; PMID: 38809652) «✓ PMID:38809652».
- Incremental Gains: The transition from B3 to B7 shows a steady increase in accuracy across all tiers:
  - B3: Top-2 (83.9%), Top-3 (87.2%), Top-4 (89.8%) (Direct, High; PMID: 38809652).
  - B7: Top-2 (87.1%), Top-3 (91.1%), Top-4 (93.5%) (Direct, High; PMID: 38809652).
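The per-tier differences can be worked out directly from the figures quoted above. The sketch below uses only those reported values; the variable names are illustrative.

```python
# Top-k accuracies (%) for B3 and B7 as quoted in this section.
TOP_K = {
    "B3": {1: 83.4, 2: 83.9, 3: 87.2, 4: 89.8, 5: 93.2},
    "B7": {1: 84.4, 2: 87.1, 3: 91.1, 4: 93.5, 5: 97.1},
}

# Gain in percentage points when moving from B3 to B7, per Top-k tier.
gains = {k: round(TOP_K["B7"][k] - TOP_K["B3"][k], 1) for k in range(1, 6)}

# Spread between Top-1 and Top-5 within each model.
spread = {m: round(v[5] - v[1], 1) for m, v in TOP_K.items()}

for k, g in gains.items():
    print(f"Top-{k}: +{g} points")
```

On these figures the gain is 1.0 point at Top-1 and between 3.2 and 3.9 points at Top-2 through Top-5. The Top-1 to Top-5 spread is 9.8 points for B3 and 12.7 points for B7, so any narrowing of that gap with model complexity is not visible in this B3 and B7 pair alone.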
Performance Trends and Trade-offs
- Complexity Correlation: Data from the EfficientNet B0–B7 suite reveals a consistent improvement in segmentation accuracy as model complexity increases (Direct, High; PMID: 38809652) «✓ PMID:38809652».
- Model Robustness: While B7 provides superior accuracy across all tiers, the accuracy gap between different performance tiers (e.g., Top-1 vs. Top-5) diminishes as model complexity increases, indicating that higher-complexity models are more robust across a broader range of segmentation scenarios (Direct, High; PMID: 38809652) «✓ PMID:38809652».
- Computational Considerations: Despite the accuracy advantage of B7, researchers note that the computational cost associated with higher-complexity models must be carefully weighed for practical applications, whereas models like B3 offer a balance of precision and efficiency (Direct, High; PMID: 38809652) «✓ PMID:38809652».
Comparative Context with Other Architectures
- General Performance: EfficientNet models (B0–B7) are noted for being faster and more efficient than older architectures, though some custom-designed ResNet models (e.g., ResNet-65) have reported higher raw accuracy (up to 99.7%) in specific multiclass studies (Derived, Medium; PMID: 38809652, DOI: 10.65278/ijtaci.2025.31).
- Benchmarking: EfficientNet-B7 is highlighted as being 8.4 times smaller than the best existing CNNs while still achieving state-of-the-art results (84.4% Top-1) for its model class (Direct, High; PMID: 38809652) «✓ PMID:38809652».