28 Farvardin 1403

Hasan Rasay

Academic rank: Assistant Professor
Address:
Education: PhD / Industrial Engineering
Phone: 38305005
Faculty: Faculty of Engineering Management

Research details

Title
Reinforcement Learning based on Stochastic Dynamic Programming for Condition-based Maintenance of Deteriorating Production Processes
Research type: Presented paper
Keywords
Maintenance, Markov decision process, Dynamic programming, Reinforcement learning
Researchers: Hasan Rasay (first author), Farnoosh Naderkhani (second author), Amir Mohammad Golmohammadi (third author)

Abstract

In this paper, a stochastic dynamic programming model is developed for maintenance planning of a deteriorating multistate production system. The quality of the batch/lot of items produced in each stage is used as the condition-monitoring signal for condition-based maintenance. The machine has m-1 operational states plus one non-operational state, referred to as the failure state. At the start of each stage, four actions are available to management: (1) renew the system; (2) perform maintenance; (3) continue production; and (4) inspect the system. Maintenance is assumed to be imperfect: after maintenance, the system is restored to a state that is no worse than its current one, with known probabilities. Because the system state evolves as a Markov chain at the end of each stage, and the quality of the items produced depends on the system state, the system can be modeled as a Markov decision process (MDP). Since the MDP is the core of reinforcement learning, the paper discusses how the proposed stochastic dynamic programming model can be used to develop reinforcement learning algorithms for large-scale problems. To this end, a Q-learning algorithm is proposed.
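
To make the setting in the abstract concrete, the following is a minimal sketch of tabular Q-learning on a small maintenance MDP. It is not the authors' model. Every number in it (the number of states, the deterioration matrix, the costs, the discount factor, the uniform restoration probabilities after maintenance) is a hypothetical value chosen only for illustration. The sketch also assumes the system state is observed directly at each stage, so the inspection action and the batch-quality signal described in the abstract are left out, and only three of the four actions appear.

```python
import numpy as np

# All numbers below are hypothetical and chosen only for illustration.
M = 4                                   # states 0..M-2 operational, M-1 = failure
RENEW, MAINTAIN, CONTINUE = 0, 1, 2     # inspection omitted: state assumed observed
N_ACTIONS = 3
GAMMA = 0.95                            # assumed discount factor

# Assumed deterioration under CONTINUE: the system stays put or gets worse.
P_CONTINUE = np.array([
    [0.80, 0.12, 0.05, 0.03],
    [0.00, 0.75, 0.15, 0.10],
    [0.00, 0.00, 0.70, 0.30],
    [0.00, 0.00, 0.00, 1.00],
])
# Assumed per-stage quality loss when producing in each state.
PRODUCTION_COST = np.array([0.0, 2.0, 5.0, 20.0])
RENEW_COST = 15.0
MAINTAIN_COST = 6.0


def step(state, action, rng):
    """Sample (next_state, cost) for one stage of the assumed model."""
    if action == RENEW:
        return 0, RENEW_COST            # renewal returns the system to the best state
    if action == MAINTAIN:
        # Imperfect maintenance: restored to some non-worse state 0..state.
        # Uniform probabilities are an assumption of this sketch.
        return int(rng.integers(0, state + 1)), MAINTAIN_COST
    next_state = int(rng.choice(M, p=P_CONTINUE[state]))
    return next_state, float(PRODUCTION_COST[state])


def q_learning(n_steps=50_000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning that minimizes expected discounted cost."""
    rng = np.random.default_rng(seed)
    q = np.zeros((M, N_ACTIONS))
    state = 0
    for _ in range(n_steps):
        # Epsilon-greedy exploration; greedy means lowest estimated cost.
        if rng.random() < epsilon:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmin(q[state]))
        next_state, cost = step(state, action, rng)
        # Q-learning update with a min over next actions (cost formulation).
        target = cost + GAMMA * q[next_state].min()
        q[state, action] += alpha * (target - q[state, action])
        state = next_state
    return q


q_table = q_learning()
policy = q_table.argmin(axis=1)         # greedy action index for each state
```

Because the sketch minimizes cost rather than maximizing reward, the greedy step and the update target use `min` and `argmin` where the usual reward formulation uses `max` and `argmax`. For a state space this small, the same model could be solved exactly by value iteration; a sampling-based method such as Q-learning becomes relevant in the large-scale case the abstract refers to.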