Kermanshah University of Technology

Research Details

Title	A reinforcement learning algorithm for optimal dynamic policies of joint condition-based maintenance and production
Research type	Presented paper
Keywords	condition-based maintenance; condition-based production; reinforcement learning; Markov decision process
Researchers	Hasan Rasay (first author), Fariba Azizi (second author), Mehrnaz Salmani (third author), Farnoosh Naderkhani (fourth author)

Abstract

This paper develops a joint optimal maintenance and production policy for a production system whose production rate is adjustable. The system's deterioration rate depends directly on the production rate: higher production rates result in greater expected deterioration. Deterioration can be controlled through two main actions: (1) scheduling and conducting maintenance actions, referred to as the maintenance policy; and (2) adjusting the production rate, referred to as the production policy. To determine the optimal action for each system state, a Markov decision process (MDP) is developed and a reinforcement learning algorithm, specifically Q-learning, is applied. The algorithm's hyperparameters are tuned using a value-iteration algorithm from dynamic programming. The goal is to minimize the system's expected costs over a finite planning horizon.
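The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the four deterioration levels, the three actions, all costs and probabilities, the horizon length, and the function names are assumptions chosen only to make the idea concrete. The sketch computes exact finite-horizon values by backward induction (the dynamic programming benchmark) and learns a time-indexed Q-table by tabular Q-learning on the same toy MDP.

```python
import random

# Toy model: every number below is an illustrative assumption, not from the paper.
N_STATES = 4            # deterioration levels 0..3; level 3 means "failed"
FAILED = N_STATES - 1
ACTIONS = ("low_rate", "high_rate", "maintain")
HORIZON = 10            # number of decision epochs in the finite planning horizon
DETERIORATION_PROB = {"low_rate": 0.2, "high_rate": 0.5}  # P(move up one level)
STEP_COST = {"low_rate": 2.0, "high_rate": 0.0, "maintain": 5.0}
FAILURE_COST = 20.0     # corrective replacement once the system has failed


def transitions(state, action):
    """Return (cost, [(probability, next_state), ...]) for one decision epoch."""
    if state == FAILED:
        return FAILURE_COST, [(1.0, 0)]       # forced replacement, as good as new
    if action == "maintain":
        return STEP_COST[action], [(1.0, 0)]  # preventive maintenance resets the state
    p = DETERIORATION_PROB[action]
    return STEP_COST[action], [(p, state + 1), (1.0 - p, state)]


def backward_induction():
    """Exact finite-horizon values V[t][s] by dynamic programming."""
    V = [[0.0] * N_STATES for _ in range(HORIZON + 1)]
    for t in range(HORIZON - 1, -1, -1):
        for s in range(N_STATES):
            V[t][s] = min(
                cost + sum(p * V[t + 1][s2] for p, s2 in nxt)
                for cost, nxt in (transitions(s, a) for a in ACTIONS)
            )
    return V


def q_learning(episodes=20000, alpha=0.1, epsilon=0.2, seed=0):
    """Tabular Q-learning with a time-indexed table Q[t][s][a] (cost minimization)."""
    rng = random.Random(seed)
    Q = [[[0.0] * len(ACTIONS) for _ in range(N_STATES)] for _ in range(HORIZON)]
    for _ in range(episodes):
        s = rng.randrange(N_STATES)           # random start state aids exploration
        for t in range(HORIZON):
            if rng.random() < epsilon:        # epsilon-greedy action selection
                a = rng.randrange(len(ACTIONS))
            else:
                a = min(range(len(ACTIONS)), key=lambda i: Q[t][s][i])
            cost, nxt = transitions(s, ACTIONS[a])
            # Sample the next state from the transition distribution.
            u, acc, s2 = rng.random(), 0.0, nxt[-1][1]
            for p, candidate in nxt:
                acc += p
                if u < acc:
                    s2 = candidate
                    break
            future = min(Q[t + 1][s2]) if t + 1 < HORIZON else 0.0
            Q[t][s][a] += alpha * (cost + future - Q[t][s][a])
            s = s2
    return Q


if __name__ == "__main__":
    V = backward_induction()
    Q = q_learning()
    for s in range(N_STATES):
        print(s, round(V[0][s], 2), round(min(Q[0][s]), 2))
```

Running the script prints, for each initial state, the exact expected cost next to the Q-learning estimate. One plausible way to use such a benchmark, in the spirit of the abstract, is to vary `alpha`, `epsilon`, and `episodes` and keep the setting whose estimates fall closest to the exact values.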
