Bioengineering, Vol. 12, Pages 565: Dual-Branch Network with Hybrid Attention for Multimodal Ophthalmic Diagnosis


Bioengineering doi: 10.3390/bioengineering12060565

Authors:
Xudong Wang
Anyu Cao
Caiye Fan
Zuoping Tan
Yuanyuan Wang

In this paper, we propose a deep learning model based on dual-branch learning with a hybrid attention mechanism. It addresses two challenges in ophthalmic image diagnosis: the underutilization of image features, and the limited generalization ability of traditional single-modality deep learning models trained on imbalanced data. First, we design a dual-branch architecture in which the left and right branches use residual blocks to process the features of a 2D image and a 3D volume, respectively. Second, we introduce a hybrid attention module driven by a frequency domain transform; it combines frequency domain attention, spatial attention, and channel attention to address inefficient feature extraction in the network. Finally, a multi-scale grouped attention fusion mechanism integrates the local details and global structure information of the two modalities, which addresses the inefficient fusion caused by the heterogeneity of modal features. The experimental results show that the accuracy of the proposed model, MOD-Net, is 1.66% and 1.14% higher than that of GeCoM-Net and ViT-2SPN, respectively. We conclude that the model effectively mines the deep correlation features of multimodal images through the hybrid attention mechanism, providing a new paradigm for the intelligent diagnosis of ophthalmic diseases.
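To make the idea of combining frequency domain, channel, and spatial attention more concrete, the following is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify layer definitions, learned parameters, or the order in which the three attentions are applied. The function names, the parameter-free sigmoid gates, the magnitude-based frequency gate, the sequential ordering, and the residual connection are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function, used here as a gate in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def frequency_attention(feat):
    """Reweight the 2D spectrum of each channel by its normalized magnitude.

    feat has shape (C, H, W). The gate is an assumption: a real module
    would typically learn its frequency weights.
    """
    spec = np.fft.fft2(feat, axes=(-2, -1))
    mag = np.abs(spec)
    gate = mag / (mag.max(axis=(-2, -1), keepdims=True) + 1e-8)
    # The gate is symmetric for real input, so the inverse is real
    # up to floating-point error.
    return np.fft.ifft2(spec * gate, axes=(-2, -1)).real


def channel_attention(feat):
    """Scale each channel by a gate computed from its global average."""
    pooled = feat.mean(axis=(1, 2))            # shape (C,)
    gate = sigmoid(pooled)                     # shape (C,)
    return feat * gate[:, None, None]


def spatial_attention(feat):
    """Scale each location by a gate from the channel-wise mean and max."""
    avg_map = feat.mean(axis=0)                # shape (H, W)
    max_map = feat.max(axis=0)                 # shape (H, W)
    gate = sigmoid(avg_map + max_map)
    return feat * gate[None, :, :]


def hybrid_attention(feat):
    """Apply the three attentions in sequence and add a residual connection.

    The ordering and the residual are hypothetical choices for this sketch.
    """
    out = frequency_attention(feat)
    out = channel_attention(out)
    out = spatial_attention(out)
    return feat + out


# Example: a random feature map with 4 channels of size 8 x 8.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
y = hybrid_attention(x)
```

The output `y` has the same shape as the input `x`, so a block of this kind could in principle be placed between residual blocks in either branch. In a trained network, each gate would be produced by learned layers rather than the fixed functions used here.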


