Processes, Vol. 13, Pages 4012: Optimizing Intermittent Pumping Duration with a Physics–Data Dual-Driven CatBoost Model Enhanced by Bayesian and Attention Mechanisms



Processes doi: 10.3390/pr13124012

Authors:
Chengming Zhang
Fuping Feng
Cong Zhang
Shiyuan Li
Junzhuzi Xie

Traditional oilfields face challenges such as high energy consumption, imprecise control, and lax management in mid-to-late development stages, leading to increased costs and reduced efficiency. To address these issues, this work aims to develop an intelligent optimization framework for intermittent pumping by explicitly integrating physical mechanisms with data-driven modeling. Specifically, we propose a data–physics dual-driven method that combines physics-based parameters derived from seepage mechanics with data-driven feature selection using Pearson correlation analysis to identify nine key production factors. An improved CatBoost regression framework is developed through systematic preprocessing, including data cleaning, cubic polynomial feature expansion, F-value screening, and Z-score normalization. The model is further enhanced using Bayesian hyperparameter optimization, a weight adaptation mechanism, and an attention-based multi-level architecture. The novelty of this work lies in the unified dual-driven optimization strategy and the enhanced CatBoost framework that jointly improve prediction accuracy and model generalization. Experimental results demonstrate that the proposed method can accurately predict pumping operation times. Compared with the original CatBoost model, the MAE of the large-interval model decreases by 56.94%, while that of the small-interval model decreases by 16.23%. In addition, the accuracy of the large-interval model increases by 4.1%, and that of the small-interval model increases by 1.22%. These improvements show that the enhanced CatBoost model significantly strengthens predictive performance. This approach provides a reliable basis for optimizing pumping schedules, reducing energy consumption, and promoting intelligent and refined oilfield management.
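Two of the preprocessing steps named in the abstract, Pearson-correlation feature screening and Z-score normalization, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names (`liquid_level`, `pump_depth`, `noise`), the toy values, and the helper functions are hypothetical and chosen only for illustration, and the paper's nine selected production factors are not reproduced here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (spread_x * spread_y)

def select_top_k(features, target, k):
    """Rank feature names by |Pearson r| against the target and keep the top k."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]

def zscore(xs):
    """Z-score normalization: subtract the mean, divide by the population std."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in xs]

# Hypothetical toy data: three candidate features and a pumping-duration target.
features = {
    "liquid_level": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pump_depth": [2.0, 1.0, 4.0, 3.0, 5.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
target = [2.0, 4.0, 6.0, 8.0, 10.0]

selected = select_top_k(features, target, k=2)
normalized = {name: zscore(features[name]) for name in selected}
```

In a full pipeline of the kind the abstract describes, the normalized columns would then feed the polynomial expansion, F-value screening, and regression stages, which this sketch omits.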


