JTAER, Vol. 20, Pages 343: Multimodal Deep Learning Framework for Automated Usability Evaluation of Fashion E-Commerce Sites
Journal of Theoretical and Applied Electronic Commerce Research doi: 10.3390/jtaer20040343
Authors:
Nahed Alowidi
Effective website usability assessment is crucial for improving user experience, driving customer satisfaction, and ensuring business success, particularly in the competitive e-commerce sector. Traditional methods, such as expert reviews and user testing, are resource-intensive and often fail to capture the interplay between a site’s aesthetic design and its technical performance. This paper introduces an end-to-end multimodal deep learning framework that automates the usability assessment of fashion e-commerce websites. The framework fuses structured numerical indicators (e.g., load time, mobile compatibility) with high-level visual features extracted from full-page screenshots. It employs a broad set of visual backbones, including modern architectures such as ConvNeXt and Vision Transformers (ViT, Swin) alongside established CNNs, and systematically evaluates three fusion strategies: early fusion, late fusion, and a state-of-the-art cross-modal fusion strategy that enables deep, bidirectional interactions between modalities. Extensive experiments show that cross-modal fusion, particularly when paired with a ConvNeXt backbone, achieves the best performance, with an accuracy of 0.92 and an F1-score of 0.89, outperforming both unimodal and simpler fusion baselines. SHAP and LIME provide model interpretability, confirming that the predictions align with established usability principles and yield actionable insights. Although validated on fashion e-commerce sites, the framework can be adapted to other domains, such as e-learning and e-government, using domain-specific data and light fine-tuning. It provides a robust, explainable benchmark for data-driven, multimodal website usability assessment and paves the way for more intelligent, automated user-experience optimization.
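The three fusion strategies named in the abstract can be contrasted with a minimal sketch. This is an illustration under stated assumptions, not the paper's implementation: the feature sizes, the three-class output, the random untrained weights, the probability averaging used for late fusion, and the projection-free attention are all hypothetical choices made to keep the example self-contained in standard-library Python.

```python
import math
import random

def linear(x, w, b):
    """Dense layer: x is a vector, w holds one weight row per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + bi for row, bi in zip(w, b)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def init(rng, n_out, n_in):
    """Random, untrained (weights, bias) pair; a stand-in for a learned head."""
    w = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def early_fusion(visual, numeric, head):
    # Concatenate both feature vectors, then apply a single classifier.
    return softmax(linear(visual + numeric, *head))

def late_fusion(visual, numeric, visual_head, numeric_head):
    # Classify each modality separately, then average the class probabilities
    # (averaging is one common combination rule, assumed here).
    p_v = softmax(linear(visual, *visual_head))
    p_n = softmax(linear(numeric, *numeric_head))
    return [(a + b) / 2 for a, b in zip(p_v, p_n)]

def attend(queries, keys_values):
    """Scaled dot-product attention without learned projections (simplified)."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values)) for j in range(d)])
    return out

def mean_pool(tokens):
    return [sum(col) / len(tokens) for col in zip(*tokens)]

def cross_modal_fusion(visual_tokens, numeric_tokens, head):
    # Bidirectional interaction: each modality queries the other one.
    v_ctx = attend(visual_tokens, numeric_tokens)
    n_ctx = attend(numeric_tokens, visual_tokens)
    return softmax(linear(mean_pool(v_ctx) + mean_pool(n_ctx), *head))

# Hypothetical sizes: 8-dim visual features, 4 numeric indicators, 3 classes.
rng = random.Random(0)
D, N_NUM, N_CLASSES = 8, 4, 3
visual = [rng.gauss(0, 1) for _ in range(D)]
numeric = [rng.gauss(0, 1) for _ in range(N_NUM)]
p_early = early_fusion(visual, numeric, init(rng, N_CLASSES, D + N_NUM))
p_late = late_fusion(visual, numeric, init(rng, N_CLASSES, D), init(rng, N_CLASSES, N_NUM))

# For cross-modal fusion both modalities are token sequences of equal width D
# (e.g. image patch embeddings and one embedded token per numeric indicator).
visual_tokens = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(5)]
numeric_tokens = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N_NUM)]
p_cross = cross_modal_fusion(visual_tokens, numeric_tokens, init(rng, N_CLASSES, 2 * D))
```

Each function returns a probability vector over the assumed classes. In a real system the visual features would come from a backbone such as ConvNeXt, and the attention would typically use learned query, key, and value projections with multiple heads.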

