Kermanshah University of Technology

Research

Title	A reinforcement learning algorithm for optimal dynamic policies of joint condition-based maintenance and production
Type	Presentation
Keywords	condition-based maintenance; condition-based production; reinforcement learning; Markov decision process
Researchers	Hasan Rasay, Fariba Azizi, Mehrnaz Salmani, Farnoosh Naderkhani

Abstract

This paper develops a joint optimal maintenance and production policy for a production system with an adjustable production rate. The system’s deterioration rate depends directly on the production rate: higher production rates result in greater expected deterioration. Deterioration can be controlled through two main actions: (1) scheduling and performing maintenance, referred to as the maintenance policy; and (2) adjusting the production rate, referred to as the production policy. To determine the optimal action for each system state, the problem is formulated as a Markov decision process (MDP) and solved with a reinforcement learning algorithm, specifically Q-learning. The algorithm’s hyperparameters are tuned using the value-iteration algorithm of dynamic programming. The goal is to minimize the expected cost of the system over a finite planning horizon.
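
The approach summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the four deterioration levels, the three actions ("low", "high", "maintain"), the transition probabilities, and all cost figures are hypothetical values chosen only for illustration. The sketch also assumes one plausible reading of the tuning step, namely that exact finite-horizon value iteration (backward induction) serves as a benchmark against which Q-learning hyperparameters are compared.

```python
import random

# All numbers below are illustrative assumptions, not values from the paper.
N_STATES = 4                            # deterioration levels 0 (new) .. 3 (failed)
ACTIONS = ("low", "high", "maintain")   # two production rates plus maintenance
HORIZON = 10                            # decision epochs in the finite horizon
DETERIORATE_P = {"low": 0.2, "high": 0.5}  # chance of moving one level up
REVENUE = {"low": 4.0, "high": 7.0}        # revenue enters as negative cost
MAINT_COST = 5.0
FAILURE_COST = 20.0


def transitions(state, action):
    """Return (probability, next_state, cost) triples for one period."""
    if action == "maintain":
        return [(1.0, 0, MAINT_COST)]          # maintenance restores level 0
    if state == N_STATES - 1:
        return [(1.0, state, FAILURE_COST)]    # a failed system cannot produce
    p = DETERIORATE_P[action]
    cost = -REVENUE[action]
    return [(p, state + 1, cost), (1.0 - p, state, cost)]


def backward_induction():
    """Exact finite-horizon value iteration; V[t][s] is minimal expected cost-to-go."""
    V = [[0.0] * N_STATES for _ in range(HORIZON + 1)]
    for t in range(HORIZON - 1, -1, -1):
        for s in range(N_STATES):
            V[t][s] = min(
                sum(p * (c + V[t + 1][s2]) for p, s2, c in transitions(s, a))
                for a in ACTIONS
            )
    return V


def q_learning(episodes, alpha, epsilon, seed=0):
    """Tabular Q-learning with a time-indexed table Q[t][s][a] (cost minimization)."""
    rng = random.Random(seed)
    Q = [[[0.0] * len(ACTIONS) for _ in range(N_STATES)]
         for _ in range(HORIZON + 1)]
    for _ in range(episodes):
        s = rng.randrange(N_STATES)            # random start so every state is visited
        for t in range(HORIZON):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))        # explore
            else:
                a = Q[t][s].index(min(Q[t][s]))        # exploit the cheapest action
            outcomes = transitions(s, ACTIONS[a])
            _, s2, cost = rng.choices(outcomes, weights=[o[0] for o in outcomes])[0]
            target = cost + min(Q[t + 1][s2])          # Q[HORIZON] stays at zero
            Q[t][s][a] += alpha * (target - Q[t][s][a])
            s = s2
    return Q


def tune(V, alphas=(0.05, 0.1, 0.3), epsilons=(0.1, 0.3), episodes=3000):
    """Grid search: pick the (alpha, epsilon) whose Q-table is closest to V at t = 0."""
    best = None
    for alpha in alphas:
        for epsilon in epsilons:
            Q = q_learning(episodes, alpha, epsilon)
            gap = max(abs(min(Q[0][s]) - V[0][s]) for s in range(N_STATES))
            if best is None or gap < best[0]:
                best = (gap, alpha, epsilon)
    return best


if __name__ == "__main__":
    V = backward_induction()
    gap, alpha, epsilon = tune(V)
    print(f"benchmark V[0] = {[round(v, 2) for v in V[0]]}")
    print(f"selected alpha = {alpha}, epsilon = {epsilon}, gap = {gap:.2f}")
```

Indexing the Q-table by decision epoch reflects the finite planning horizon, under which the best action can depend on the time remaining as well as on the deterioration level. The grid in tune is deliberately small; a wider grid or more episodes would be a natural extension.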
