MAKE, Vol. 7, Pages 117: Learning to Partition: Dynamic Deep Neural Network Model Partitioning for Edge-Assisted Low-Latency Video Analytics


Machine Learning and Knowledge Extraction doi: 10.3390/make7040117

Authors:
Yan Lyu
Likai Liu
Xuezhi Wang
Zhiyu Fan
Jinchen Wang
Guanyu Gao

In edge-assisted low-latency video analytics, a critical challenge is balancing on-device inference latency against the bandwidth cost and network delay of offloading. Handling this trade-off poorly degrades performance and hinders latency-critical applications such as autonomous systems. Existing solutions often rely on static partitioning or on greedy algorithms that optimize each frame in isolation. These myopic approaches adapt poorly to changing network and workload conditions, leading to high long-term cost and significant frame drops. This paper introduces a partitioning technique in which a Deep Reinforcement Learning (DRL) agent running on the local device dynamically partitions a video analytics Deep Neural Network (DNN). By observing the overall system state, the agent learns a farsighted policy that selects the DNN split point for each frame. Because it optimizes a cumulative long-term reward rather than a per-frame cost, the method outperforms the compared baselines in our real-world testbed evaluation, reducing overall system cost and latency while nearly eliminating frame drops. The primary limitation is the initial offline training phase that the DRL agent requires. Future work will extend this dynamic partitioning framework to multi-device and multi-edge environments.
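To make the split-point trade-off concrete, the following is a minimal Python sketch of the kind of greedy, single-frame baseline the abstract contrasts against. It is not the paper's DRL method. The latency model (device compute, plus transfer of the intermediate tensor, plus edge compute), the `Profile` fields, the function names, and the units are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Profile:
    """Hypothetical per-layer profile of a DNN with n layers."""
    device_ms: List[float]  # latency of each layer on the local device (length n)
    edge_ms: List[float]    # latency of each layer on the edge server (length n)
    out_kb: List[float]     # out_kb[k]: kilobytes sent when splitting after k layers
                            # (out_kb[0] is the raw frame; length n + 1)


def split_latency_ms(p: Profile, k: int, bandwidth_kbps: float) -> float:
    """Latency of one frame when layers [0, k) run locally and [k, n) on the edge."""
    n = len(p.device_ms)
    local = sum(p.device_ms[:k])
    if k == n:
        return local  # fully on-device: nothing is transmitted
    # kilobytes -> kilobits -> seconds -> milliseconds
    transfer = p.out_kb[k] * 8.0 / bandwidth_kbps * 1000.0
    remote = sum(p.edge_ms[k:])
    return local + transfer + remote


def greedy_split(p: Profile, bandwidth_kbps: float) -> int:
    """Myopic choice: the split point that minimizes latency for this frame only."""
    n = len(p.device_ms)
    return min(range(n + 1), key=lambda k: split_latency_ms(p, k, bandwidth_kbps))
```

With a made-up three-layer profile such as `Profile([10, 20, 30], [1, 2, 3], [500, 200, 50, 1])`, this rule would offload after the first layer on a fast link and fall back to fully local inference on a slow one. A rule of this kind looks only at the current frame; the learned policy described above instead optimizes a cumulative reward, which a per-frame minimum like this cannot capture.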



