Smart Maintenance Optimization for Large Scale Parallel Systems Using Deep Reinforcement Learning

Research

Title	Smart Maintenance Optimization for Large Scale Parallel Systems Using Deep Reinforcement Learning
Type	JournalPaper
Keywords	Deep reinforcement learning (DRL) , dynamic maintenance , machine learning , manufacturing parallel systems , Markov decision process (MDP)
Year	2025
Journal	IEEE TRANSACTIONS ON RELIABILITY
DOI
Researchers	Fariba Azizi ، Zahra Rezvani ، Hasan Rasay ، Abdollah Safari ، Mehrnaz Salmani ، Farnoosh Naderkhani

Abstract

In today's era of Industry 4.0, with the unprecedented availability of data and advancements in technology, it is imperative to adopt smart and dynamic maintenance scheduling, especially for large-scale systems, to harness optimal operational efficiency. In this regard, this article presents a machine learning-based maintenance decision-making framework for multiunit systems. Specifically, we apply deep reinforcement learning (DRL) to a dynamic maintenance model designed for a multiunit parallel system subject to stochastic degradation and random failures. Each unit deteriorates independently through a three-state homogeneous Markov process, transitioning between healthy, unhealthy, or failed states. We define the overall system state by combining individual component states and model their interactions using the bivariate birth/birth–death process. To minimize costs, we use the Markov decision process framework to solve the optimal maintenance policy. We evaluate and compare advanced DRL methods, including proximal policy optimization (PPO) and double deep Q-networks (DDQN), against several baseline approaches. The results show that PPO consistently outperforms all methods, providing the most effective and reliable maintenance strategies. While DDQN performs better than some baseline methods, it occasionally falls short compared to others. These findings highlight the strengths and limitations of different reinforcement learning techniques in determining optimal maintenance policies. In addition, we provide a numerical example that illustrates the use of reinforcement learning methods in a practical scenario, emphasizing the scalability and efficiency of our proposed framework for large-scale systems. Our results show exemplary performance in optimizing maintenance strategies and contribute to the advancement of smart maintenance solutions for complex industrial systems.

Hasan Rasay

Research

Abstract