28 Farvardin 1403

Hasan Rasay

Academic rank: Assistant Professor
Address:
Education: PhD / Industrial Engineering
Phone: 38305005
Faculty: Faculty of Engineering Management

Research details

Title
A reinforcement learning algorithm for optimal dynamic policies of joint condition based maintenance and production
Research type: Presented paper
Keywords
condition-based maintenance; condition-based production; reinforcement learning; Markov decision process
Researchers: Hasan Rasay (first author), Fariba Azizi (second author), Mehrnaz Salmani (third author), Farnoosh Naderkhani (fourth author)

Abstract

This paper develops a joint optimal maintenance and production policy for a production system with an adjustable production rate. The system’s deterioration rate depends directly on the production rate: higher production rates result in greater expected deterioration. Deterioration can be controlled through two main actions: (1) scheduling and performing maintenance, referred to as the maintenance policy; and (2) adjusting the production rate, referred to as the production policy. To determine the optimal action for each system state, a Markov decision process (MDP) is formulated and a reinforcement learning algorithm, specifically Q-learning, is applied. The algorithm’s hyperparameters are tuned using a value-iteration algorithm from dynamic programming. The goal is to minimize the system’s expected cost over a finite planning horizon.
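
As a rough illustration of the approach summarized in the abstract, the sketch below builds a small hypothetical MDP with deterioration levels, two production rates, and a maintenance action, then solves it twice: exactly by backward induction (the finite-horizon form of value iteration) and approximately by tabular Q-learning. Every number in it (state count, costs, revenues, deterioration probabilities, horizon) and every function name is an assumption made for illustration; none is taken from the paper. Comparing the learned values against the exact ones is one plausible way to judge hyperparameters such as the learning rate and exploration rate, but the paper's actual tuning procedure is not described on this page.

```python
import random

# Hypothetical toy model: all numbers are illustrative assumptions,
# not values from the paper.
N_STATES = 4                      # deterioration levels 0..3
FAILED = N_STATES - 1             # the highest level means "failed"
LOW, HIGH, MAINTAIN = 0, 1, 2     # two production rates plus maintenance
ACTIONS = (LOW, HIGH, MAINTAIN)
REVENUE = {LOW: 1.0, HIGH: 3.0}          # income per period (negative cost)
P_DETERIORATE = {LOW: 0.1, HIGH: 0.4}    # higher rate -> faster expected wear
C_PREVENTIVE, C_CORRECTIVE, C_DOWNTIME = 2.0, 10.0, 4.0
HORIZON = 10                      # finite planning horizon


def outcomes(s, a):
    """Return [(probability, cost, next_state), ...] for action a in state s."""
    if a == MAINTAIN:             # maintenance restores the as-new state
        cost = C_CORRECTIVE if s == FAILED else C_PREVENTIVE
        return [(1.0, cost, 0)]
    if s == FAILED:               # a failed system cannot produce
        return [(1.0, C_DOWNTIME, FAILED)]
    p = P_DETERIORATE[a]
    return [(p, -REVENUE[a], s + 1), (1.0 - p, -REVENUE[a], s)]


def empty_table():
    """Q[t][s][a] initialised to zero; the extra stage HORIZON is terminal."""
    return [[[0.0] * len(ACTIONS) for _ in range(N_STATES)]
            for _ in range(HORIZON + 1)]


def greedy(q, t, s):
    """Cost-minimising action at stage t in state s under table q."""
    return min(ACTIONS, key=lambda a: q[t][s][a])


def backward_induction():
    """Exact dynamic programming over the finite horizon."""
    q = empty_table()
    for t in range(HORIZON - 1, -1, -1):
        for s in range(N_STATES):
            for a in ACTIONS:
                q[t][s][a] = sum(p * (c + min(q[t + 1][s2]))
                                 for p, c, s2 in outcomes(s, a))
    return q


def q_learning(episodes=5000, alpha=0.1, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (costs are minimised)."""
    rng = random.Random(seed)
    q = empty_table()
    for _ in range(episodes):
        s = 0                                     # each episode starts as new
        for t in range(HORIZON):
            a = rng.choice(ACTIONS) if rng.random() < epsilon else greedy(q, t, s)
            # Sample one transition from the model.
            u, acc = rng.random(), 0.0
            for p, c, s2 in outcomes(s, a):
                acc += p
                if u < acc:
                    break
            target = c + min(q[t + 1][s2])        # one-step bootstrapped target
            q[t][s][a] += alpha * (target - q[t][s][a])
            s = s2
    return q


q_star = backward_induction()
q_hat = q_learning()
# Gap between learned and exact expected cost from the initial state.
gap = abs(min(q_hat[0][0]) - min(q_star[0][0]))
print("exact cost-to-go:", min(q_star[0][0]), "| Q-learning gap:", gap)
```

Rerunning `q_learning` with different `alpha` and `epsilon` values and keeping the pair with the smallest gap is a simple grid-search style of tuning under these assumptions. The table is indexed by stage as well as state because, with a finite horizon, the best action can change as the end of the horizon approaches.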