Kermanshah University of Technology

Research

Title	Reinforcement Learning based on Stochastic Dynamic Programming for Condition-based Maintenance of Deteriorating Production Processes
Type	Presentation
Keywords	Maintenance, Markov decision process, Dynamic programming, Reinforcement learning
Researchers	Hasan Rasay, Farnoosh Naderkhani, Amir Mohammad Golmohammadi

Abstract

In this paper, a stochastic dynamic programming model is developed for maintenance planning of a deteriorating multistate production system. The quality of the batch/lot of items produced in each stage serves as the condition-monitoring signal for condition-based maintenance. The machine has m-1 operational states plus one non-operational state, referred to as the failure state. At the start of each stage, four actions are available to management: (1) renew the system; (2) perform maintenance; (3) continue production; and (4) inspect the system. Maintenance is assumed to be imperfect: after maintenance, the system is restored to any non-worse state with known probabilities. Because the system state changes according to a Markov chain at the end of each stage, and the quality of the items produced depends on the system state, the system can be modeled as a Markov decision process (MDP). Since the MDP is the core of reinforcement learning, the paper discusses how the proposed stochastic dynamic programming model can be used to develop reinforcement learning algorithms for large-scale problems. To this end, a Q-learning algorithm is proposed.
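
To make the link between the MDP formulation and Q-learning concrete, the following is a minimal sketch of tabular Q-learning on a toy deteriorating machine. It is not the authors' model: the number of states, the transition matrix, the costs, the uniform post-maintenance distribution, and all names (`P_CONTINUE`, `QUALITY_COST`, `step`, `q_learning`) are hypothetical and chosen only for illustration. The sketch also assumes the machine state is observed directly at each stage, so the inspection action is left out.

```python
import random

# Hypothetical toy instance; none of these numbers come from the paper.
N_STATES = 4                         # states 0..2 operational, state 3 = failure
CONTINUE, MAINTAIN, RENEW = 0, 1, 2  # inspection omitted (state assumed observed)
ACTIONS = (CONTINUE, MAINTAIN, RENEW)

# Deterioration under "continue": row i is the next-state distribution from i.
P_CONTINUE = [
    [0.7, 0.2, 0.1, 0.0],
    [0.0, 0.6, 0.3, 0.1],
    [0.0, 0.0, 0.6, 0.4],
    [0.0, 0.0, 0.0, 1.0],
]
QUALITY_COST = [0.0, 2.0, 5.0, 20.0]  # assumed per-stage quality loss by state
MAINTAIN_COST = 4.0
RENEW_COST = 10.0


def step(state, action, rng):
    """Simulate one stage and return (next_state, cost)."""
    if action == RENEW:
        return 0, RENEW_COST
    if action == MAINTAIN:
        # Imperfect maintenance: land in any non-worse state.
        # Uniform probabilities are an assumption of this sketch.
        return rng.randint(0, state), MAINTAIN_COST
    next_state = rng.choices(range(N_STATES), weights=P_CONTINUE[state])[0]
    return next_state, QUALITY_COST[state]


def q_learning(n_steps=50_000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning that minimizes expected discounted cost."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    state = 0
    for _ in range(n_steps):
        # Epsilon-greedy action selection (greedy = lowest estimated cost).
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = min(ACTIONS, key=lambda a: q[state][a])
        next_state, cost = step(state, action, rng)
        # Q-learning update: bootstrap from the best action in the next state.
        target = cost + gamma * min(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the cost-minimizing action for each state."""
    return [min(ACTIONS, key=lambda a: row[a]) for row in q]


q_table = q_learning()
policy = greedy_policy(q_table)
```

Here `policy[s]` gives the action with the lowest estimated cost in state `s`. For a problem this small, the same policy could be computed exactly by dynamic programming; a sampling-based method such as Q-learning becomes attractive when the state space is too large to enumerate, which is the large-scale setting the abstract refers to.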
