Beverages, Vol. 11, Pages 80: From Words to Ratings: Machine Learning and NLP for Wine Reviews


Beverages doi: 10.3390/beverages11030080

Authors:
Iliana Ilieva
Margarita Terziyska
Teofana Dimitrova

Wine production is an important sector of the food industry in Bulgaria, contributing to both economic development and cultural heritage. This study shows how natural language processing (NLP) and machine learning methods can be applied to expert-written Bulgarian wine descriptions to extract patterns related to wine quality and style. Using a bilingual dataset of reviews (in Bulgarian and English), semantic analysis was applied together with classification, regression, and clustering models that combine textual and structured data. The descriptions were converted into numerical representations with a pre-trained language model (BERT), and these representations were then used to predict style categories and ratings. Additional sentiment and segmentation analyses revealed differences between wine types, and clustering identified thematic structures in the expert language. The agreement between predefined styles and automatically derived clusters was evaluated using metrics such as the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI). The analysis shows that text descriptions contain valuable information that enables automated wine profiling. These findings can be applied by a wide range of stakeholders: researchers, producers, retailers, and marketing specialists.
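As an illustration of how predefined styles can be compared with automatically derived clusters, the following minimal Python sketch computes ARI and NMI from two label lists using only the standard library. It is not the authors' code: the function names, the style labels, and the cluster assignments are hypothetical, and the NMI here is normalized by the arithmetic mean of the two entropies, which is one common convention among several.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same items (1.0 = identical partitions)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # fewer than two items: no pairs to disagree on
    cells = Counter(zip(labels_a, labels_b))  # contingency table
    rows, cols = Counter(labels_a), Counter(labels_b)
    # Count item pairs grouped together in both, in A only, in B only.
    same_both = sum(comb(c, 2) for c in cells.values())
    same_a = sum(comb(c, 2) for c in rows.values())
    same_b = sum(comb(c, 2) for c in cols.values())
    expected = same_a * same_b / comb(n, 2)  # agreement expected by chance
    maximum = (same_a + same_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (same_both - expected) / (maximum - expected)


def _entropy(counts, n):
    return -sum((c / n) * log(c / n) for c in counts.values())


def normalized_mutual_information(labels_a, labels_b):
    """NMI normalized by the arithmetic mean of the two label entropies."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))
    rows, cols = Counter(labels_a), Counter(labels_b)
    mutual_info = sum(
        (c / n) * log(n * c / (rows[a] * cols[b]))
        for (a, b), c in cells.items()
    )
    mean_entropy = (_entropy(rows, n) + _entropy(cols, n)) / 2
    return 1.0 if mean_entropy == 0 else mutual_info / mean_entropy


if __name__ == "__main__":
    # Hypothetical data: expert-assigned styles vs. cluster ids from embeddings.
    styles = ["red", "red", "white", "white", "rose", "rose"]
    clusters = [0, 0, 1, 1, 1, 2]
    print("ARI:", round(adjusted_rand_index(styles, clusters), 3))
    print("NMI:", round(normalized_mutual_information(styles, clusters), 3))
```

Both scores ignore the actual label names and depend only on how items are grouped, so cluster ids do not need to be matched to style names beforehand. ARI is corrected for chance and is close to 0 for random assignments, while NMI ranges from 0 to 1.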


