Electronics, Vol. 15, Pages 617: Deep Reinforcement Learning-Based Experimental Scheduling System for Clay Mineral Extraction
Electronics doi: 10.3390/electronics15030617
Authors:
Bo Zhou
Lei He
Yongqiang Li
Zhandong Lv
Shiping Zhang
Efficient and non-destructive extraction of clay minerals is fundamental for shale oil and gas reservoir evaluation and enrichment mechanism studies. However, traditional manual extraction experiments face bottlenecks such as low efficiency and reliance on operator experience, which limit their scalability and adaptability to intelligent research demands. To address this, this paper proposes an intelligent experimental scheduling system for clay mineral extraction based on deep reinforcement learning. First, the complex experimental process is deconstructed, and its core scheduling stages are abstracted into a Flexible Job Shop Scheduling Problem (FJSP) model with resting time constraints. Then, a scheduling agent based on the Proximal Policy Optimization (PPO) algorithm is developed and integrated with an improved Heterogeneous Graph Neural Network (HGNN) to represent the relationships among operations, machines, and constraints. This enables effective capture of the complex topological structure of the experimental environment and facilitates efficient sequential decision-making. To facilitate future practical applicability, a four-layer system architecture is proposed, comprising the physical equipment layer, execution control layer, scheduling decision layer, and interactive application layer. A digital twin module is designed to bridge the gap between theoretical scheduling and physical execution. This study focuses on validating the core scheduling algorithm through realistic simulations. Simulation results demonstrate that the proposed HGNN-PPO scheduling method significantly outperforms traditional heuristic rules (FIFO, SPT), meta-heuristic algorithms (GA), and simplified reinforcement learning methods (PPO-MLP). Specifically, in large-scale problems, our method reduces the makespan by over 9% compared to the PPO-MLP baseline, and the algorithm runs more than 30 times faster than GA. This highlights its superior performance and scalability. This study provides an effective solution for intelligent scheduling in automated chemical laboratory workflows and holds significant theoretical and practical value for advancing the intelligentization of experimental sciences, including shale oil and gas research.
Source link
Bo Zhou www.mdpi.com
