Plants, Vol. 14, Pages 2329: Tomato Leaf Disease Identification Framework FCMNet Based on Multimodal Fusion


Plants, Vol. 14, Pages 2329: Tomato Leaf Disease Identification Framework FCMNet Based on Multimodal Fusion

Plants doi: 10.3390/plants14152329

Authors:
Siming Deng
Jiale Zhu
Yang Hu
Mingfang He
Yonglin Xia

Precisely recognizing diseases in tomato leaves plays a crucial role in enhancing the health, productivity, and quality of tomato crops. However, disease identification methods that rely on single-mode information often face the problems of insufficient accuracy and weak generalization ability. Therefore, this paper proposes a tomato leaf disease recognition framework FCMNet based on multimodal fusion, which combines tomato leaf disease image and text description to enhance the ability to capture disease characteristics. In this paper, the Fourier-guided Attention Mechanism (FGAM) is designed, which systematically embeds the Fourier frequency-domain information into the spatial-channel attention structure for the first time, enhances the stability and noise resistance of feature expression through spectral transform, and realizes more accurate lesion location by means of multi-scale fusion of local and global features. In order to realize the deep semantic interaction between image and text modality, a Cross Vision–Language Alignment module (CVLA) is further proposed. This module generates visual representations compatible with Bert embeddings by utilizing block segmentation and feature mapping techniques. Additionally, it incorporates a probability-based weighting mechanism to achieve enhanced multimodal fusion, significantly strengthening the model’s comprehension of semantic relationships across different modalities. Furthermore, to enhance both training efficiency and parameter optimization capabilities of the model, we introduce a Multi-strategy Improved Coati Optimization Algorithm (MSCOA). This algorithm integrates Good Point Set initialization with a Golden Sine search strategy, thereby boosting global exploration, accelerating convergence, and effectively preventing entrapment in local optima. Consequently, it exhibits robust adaptability and stable performance within high-dimensional search spaces. The experimental results show that the FCMNet model has increased the accuracy and precision by 2.61% and 2.85%, respectively, compared with the baseline model on the self-built dataset of tomato leaf diseases, and the recall and F1 score have increased by 3.03% and 3.06%, respectively, which is significantly superior to the existing methods. This research provides a new solution for the identification of tomato leaf diseases and has broad potential for agricultural applications.



Source link

Siming Deng www.mdpi.com