Exploring multimodal deep learning for skin-cancer detection by combining image and clinical data through fusion-based models.
CNN Model
A ResNet-18 CNN trained on dermoscopic images to identify visual lesion patterns such as texture and color. Leveraging transfer learning from ImageNet, the model learns distinctive spatial cues from skin imagery.
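As a rough illustration (not the project's exact code), the PyTorch sketch below builds this baseline: an ImageNet-pretrained ResNet-18 whose final layer is swapped for a lesion classifier. `build_image_model` and `NUM_CLASSES` are hypothetical names, and the class count is an assumption.

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 7  # assumed class count; set to the dataset's label set

def build_image_model(num_classes: int = NUM_CLASSES) -> nn.Module:
    # Start from ImageNet-pretrained weights (transfer learning).
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    # Replace the 1000-way ImageNet head with a lesion classifier.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```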
MLP Model
A Multi-Layer Perceptron (MLP) trained on patient metadata (age, lesion site, and color irregularity). This baseline highlights how structured clinical data alone can effectively predict lesion types.
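A minimal sketch of such a metadata baseline, assuming categorical fields (e.g., lesion site) are one-hot encoded upstream; `MetadataMLP`, the hidden width, and the dropout rate are illustrative choices rather than the project's exact configuration.

```python
import torch
import torch.nn as nn

class MetadataMLP(nn.Module):
    """Small MLP over tabular clinical features (age, lesion site, color irregularity)."""
    def __init__(self, in_features: int, num_classes: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Dropout(0.3),  # assumed regularization, not from the source
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)
```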
Fusion Model
The fusion-based approach combines image and metadata features using both early fusion and late fusion techniques; illustrative sketches of each follow the descriptions below:
Early Fusion: Rather than training separate branches, the two modalities are merged into a single joint input representation, so one network processes image and metadata together from the start.
Late Fusion: The late-fusion architecture pairs a Multilayer Perceptron (MLP) for metadata with a ResNet-18 backbone for image feature extraction. Each modality is processed independently, and the resulting embeddings are merged for final classification.
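One common way to realize early fusion, sketched below under assumptions the source does not spell out, is to tile the metadata vector into extra input channels so a single CNN sees both modalities from its first layer. `EarlyFusionNet` and the zero-initialized extra filters are illustrative choices, not the project's confirmed design.

```python
import torch
import torch.nn as nn
from torchvision import models

class EarlyFusionNet(nn.Module):
    """Early fusion: metadata tiled into extra image channels so one CNN
    processes both modalities jointly from the stem (illustrative design)."""
    def __init__(self, meta_features: int, num_classes: int):
        super().__init__()
        self.backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        old = self.backbone.conv1
        # Widen the stem from 3 RGB channels to 3 + meta_features channels.
        self.backbone.conv1 = nn.Conv2d(
            3 + meta_features, old.out_channels,
            kernel_size=old.kernel_size, stride=old.stride,
            padding=old.padding, bias=False,
        )
        with torch.no_grad():
            # Keep the pretrained RGB filters; zero the new metadata filters
            # so initialization matches the pretrained network exactly.
            self.backbone.conv1.weight[:, :3] = old.weight
            self.backbone.conv1.weight[:, 3:] = 0.0
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_classes)

    def forward(self, image: torch.Tensor, meta: torch.Tensor) -> torch.Tensor:
        b, _, h, w = image.shape
        # Broadcast each metadata vector across the spatial grid: (B, M) -> (B, M, H, W).
        meta_planes = meta[:, :, None, None].expand(b, meta.size(1), h, w)
        return self.backbone(torch.cat([image, meta_planes], dim=1))
```

The late-fusion description above maps naturally to two branches whose embeddings are concatenated before a shared classifier. This sketch (imports as above) follows that description, with `meta_dim` as an assumed embedding size.

```python
class LateFusionNet(nn.Module):
    """Late fusion: separate image (ResNet-18) and metadata (MLP) branches,
    merged by concatenating their embeddings before the final classifier."""
    def __init__(self, meta_features: int, num_classes: int, meta_dim: int = 32):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        feat_dim = backbone.fc.in_features   # 512 for ResNet-18
        backbone.fc = nn.Identity()          # expose the 512-d image embedding
        self.image_branch = backbone
        self.meta_branch = nn.Sequential(nn.Linear(meta_features, meta_dim), nn.ReLU())
        self.classifier = nn.Linear(feat_dim + meta_dim, num_classes)

    def forward(self, image: torch.Tensor, meta: torch.Tensor) -> torch.Tensor:
        img_emb = self.image_branch(image)   # (B, 512)
        meta_emb = self.meta_branch(meta)    # (B, meta_dim)
        return self.classifier(torch.cat([img_emb, meta_emb], dim=1))
```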
Conclusion
The Late Fusion model outperformed all others, showing that combining independent feature representations leads to better generalization. Interestingly, metadata alone proved highly predictive, emphasizing its diagnostic value.