SwinVQA: A Transformer Framework for Medical Visual Question Answering

Keyur Vaitha, Jay Korat, Anand Akbari, Chinmay Raut

Abstract

The integration of advanced computer vision and natural language processing techniques in medical image analysis presents significant challenges due to the complexity and high stakes of diagnostic interpretation. This paper introduces SwinVQA, a novel medical visual question answering framework that leverages the hierarchical architecture of Swin Transformers to address limitations of existing systems. By employing shifted window partitioning and patch merging, SwinVQA efficiently processes high-resolution medical images while capturing both the local detail and the global context essential for accurate diagnosis. We implement a cross-modal attention mechanism that aligns visual features with clinical queries, enhancing the model’s reasoning capabilities. The framework is evaluated on the established medical VQA datasets SLAKE and VQA-RAD, demonstrating improved performance across question types and imaging modalities. Additionally, we introduce beam search for answer generation, producing more contextually appropriate and diagnostically accurate responses. Experimental results show that SwinVQA significantly outperforms baseline models in both computational efficiency and diagnostic accuracy. This research advances AI-assisted medical image analysis by providing a robust and clinically relevant solution that bridges the gap between technological capability and practical healthcare application.
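
To illustrate the cross-modal fusion step described in the abstract, the following is a minimal PyTorch-style sketch in which clinical question tokens attend over visual patch tokens produced by a Swin Transformer backbone. The module name, dimensions, and residual fusion are illustrative assumptions for exposition, not the authors' implementation.

import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    # Illustrative sketch: question tokens (queries) attend over visual patch
    # tokens (keys/values) from a Swin backbone; all names/sizes are assumptions.
    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, question_tokens, visual_tokens):
        # question_tokens: (B, Lq, dim) from a text encoder
        # visual_tokens:   (B, Lv, dim) from the Swin visual encoder
        attended, _ = self.attn(question_tokens, visual_tokens, visual_tokens)
        # Residual fusion keeps the original question representation while
        # injecting image context relevant to the query.
        return self.norm(question_tokens + attended)

# Toy usage with random tensors standing in for encoder outputs.
fusion = CrossModalAttention()
q = torch.randn(2, 16, 768)   # 16 question tokens
v = torch.randn(2, 49, 768)   # 49 visual tokens (e.g. a 7x7 Swin feature grid)
out = fusion(q, v)            # (2, 16, 768): question tokens enriched with visual context

Answer generation would then decode over such fused representations; with beam search, a standard decoder keeps the top-k partial hypotheses at each step and returns the highest-scoring completed answer rather than a purely greedy choice.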
