A Dual-System Reinforcement Learning Method for Flexible Job Shop Dynamic Scheduling

LIU Yahui, SHEN Xingwang, GU Xinghai, PENG Tao, BAO Jinsong, ZHANG Dan

doi:10.16183/j.cnki.jsjtu.2021.215

Journal of Shanghai Jiao Tong University, 2022, Vol. 56, Issue 9: 1262-1275


Mechanical and Power Engineering


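1. School of Mechanical Engineering, Donghua University, Shanghai 201620, China
2. School of Mechanical Engineering, Zhejiang University, Hangzhou 310027, China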

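LIU Yahui (1997-), female, born in Xuchang, Henan Province, master's student. Research interests: cognitive manufacturing, knowledge graphs, and intelligent scheduling.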

Received: 2021-06-22

Published online: 2022-10-09

Funding

National Key Research and Development Program of China (2019YFB1706300)


A Dual-System Reinforcement Learning Method for Flexible Job Shop Dynamic Scheduling


1. School of Mechanical Engineering, Donghua University, Shanghai 201620, China
2. School of Mechanical Engineering, Zhejiang University, Hangzhou 310027, China

Received date: 2021-06-22

Online published: 2022-10-09


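Abstract

In aerospace structural part production, batch production tasks coexist with research and development (R&D) tasks, and customized small-batch R&D tasks cause frequent urgent order insertions. To ensure that tasks finish on schedule and to solve the dynamic scheduling problem faced by flexible job shops, a dual-loop deep Q network method driven by a perception-cognition dual system is proposed, with minimizing the average equipment load and minimizing the total completion time as the optimization objectives. The perception system represents workshop knowledge with a knowledge graph and generates a multi-dimensional information matrix. The cognition system abstracts the scheduling process into two stages, a resource allocation agent and an operation sequencing agent, each corresponding to one of the two optimization objectives; a workshop state matrix is designed to describe the problem and its constraints, and action instructions are introduced step by step during scheduling decisions. Finally, separate reward functions are designed to evaluate the resource allocation decisions and the operation sequencing decisions. A case study on aerospace shell machining at a power research institute and a comparative analysis of algorithms verify the superiority of the proposed method.

Keywords: perception-cognition dual system; dual-loop deep Q network; dynamic scheduling; knowledge graph; multi-agent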
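
The two optimization objectives named in the abstract can be made concrete with a small sketch. The abstract does not give their formulas, so the definitions here are assumptions: total completion time is taken as the finish time of the last operation, and average equipment load as the total busy time divided by the number of machines. The `ScheduledOp` type, the machine names, and the toy schedule are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class ScheduledOp(NamedTuple):
    """One operation placed on a machine (hypothetical record layout)."""
    job: str
    op_index: int
    machine: str
    start: float
    end: float


def total_completion_time(schedule):
    # Assumed definition: finish time of the last operation in the schedule.
    return max(op.end for op in schedule)


def average_machine_load(schedule, machines):
    # Assumed definition: summed busy time per machine, averaged over all
    # machines in the shop, including machines that received no work.
    busy = defaultdict(float)
    for op in schedule:
        busy[op.machine] += op.end - op.start
    return sum(busy[m] for m in machines) / len(machines)


# Toy schedule: two jobs with two operations each on a three-machine shop.
schedule = [
    ScheduledOp("J1", 0, "M1", 0.0, 3.0),
    ScheduledOp("J1", 1, "M2", 3.0, 5.0),
    ScheduledOp("J2", 0, "M2", 0.0, 2.0),
    ScheduledOp("J2", 1, "M1", 3.0, 7.0),
]
machines = ["M1", "M2", "M3"]

print(total_completion_time(schedule))
print(average_machine_load(schedule, machines))
```

Under these assumed definitions the two values can pull in different directions: routing work to a slower machine may shorten the schedule while raising the total busy time, which is one reason to score the two decisions separately.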

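Cite this article

LIU Yahui, SHEN Xingwang, GU Xinghai, PENG Tao, BAO Jinsong, ZHANG Dan. A Dual-System Reinforcement Learning Method for Flexible Job Shop Dynamic Scheduling[J]. Journal of Shanghai Jiao Tong University, 2022, 56(9): 1262-1275. DOI: 10.16183/j.cnki.jsjtu.2021.215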

Abstract

In the production of aerospace structural parts, batch production tasks coexist with research and development (R&D) tasks, and customized small-batch R&D tasks lead to frequent urgent order insertions. To ensure that tasks are completed on schedule and to solve the flexible job shop dynamic scheduling problem, this paper takes minimization of the average equipment load and of the total completion time as optimization objectives and proposes a dual-loop deep Q network (DL-DQN) method driven by a perception-cognition dual system. The perception system uses a knowledge graph to represent workshop knowledge and to generate a multi-dimensional information matrix. The cognition system abstracts the scheduling process into two stages, a resource allocation agent and an operation sequencing agent, which correspond to the two optimization objectives respectively. A workshop state matrix is designed to describe the problem and its constraints, and action instructions are introduced step by step during scheduling decisions. Finally, separate reward functions are designed to evaluate the resource allocation and operation sequencing decisions. A case study on aerospace shell machining at an aerospace institute and a comparison with other algorithms verify the superiority of the proposed method.

Key words: perception-cognition dual system; dual-loop deep Q network (DL-DQN); dynamic scheduling; knowledge graph; multi-agent
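
The two-stage decision structure described in the abstract, machine allocation first and operation sequencing second, might be sketched as follows. This is not the DL-DQN itself: the trained Q-networks are replaced by hand-written scoring functions, and only the epsilon-greedy action selection that DQN-style agents commonly use is kept. The toy instance `OPS`, the function names, and both scoring rules are assumptions made for illustration.

```python
import random

# Hypothetical toy instance: (job, operation index) -> {eligible machine: processing time}.
OPS = {
    ("J1", 0): {"M1": 3, "M2": 5},
    ("J1", 1): {"M2": 2, "M3": 4},
    ("J2", 0): {"M1": 4, "M3": 2},
}


def epsilon_greedy(actions, q_value, epsilon, rng):
    """Explore with probability epsilon, otherwise take the highest-valued action."""
    actions = list(actions)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=q_value)


def schedule_two_stage(ops, epsilon=0.0, seed=0):
    rng = random.Random(seed)
    machines = sorted({m for times in ops.values() for m in times})
    load = {m: 0 for m in machines}

    # Stage 1, resource allocation: pick one machine per operation.
    assignment = {}
    for op in sorted(ops):
        times = ops[op]
        # Stand-in for the allocation Q-network: prefer a low resulting load.
        machine = epsilon_greedy(
            sorted(times), lambda m: -(load[m] + times[m]), epsilon, rng
        )
        assignment[op] = machine
        load[machine] += times[machine]

    # Stage 2, operation sequencing: dispatch one ready operation at a time.
    machine_free = {m: 0 for m in machines}
    job_ready = {job: 0 for job, _ in ops}
    next_index = {job: 0 for job, _ in ops}
    pending = set(ops)
    schedule = []
    while pending:
        # An operation is ready once all earlier operations of its job are done.
        ready = sorted(op for op in pending if op[1] == next_index[op[0]])

        def finish_time(op):
            machine = assignment[op]
            start = max(machine_free[machine], job_ready[op[0]])
            return start + ops[op][machine]

        # Stand-in for the sequencing Q-network: prefer early completion.
        op = epsilon_greedy(ready, lambda o: -finish_time(o), epsilon, rng)
        end = finish_time(op)
        machine = assignment[op]
        schedule.append((op, machine, end - ops[op][machine], end))
        machine_free[machine] = job_ready[op[0]] = end
        next_index[op[0]] += 1
        pending.remove(op)
    return assignment, schedule


assignment, schedule = schedule_two_stage(OPS)
total_completion_time = max(end for _, _, _, end in schedule)
print(assignment)
print(schedule)
```

With `epsilon=0.0` both stages act greedily, so the run is deterministic. In a learning setting each stand-in score would be replaced by the output of the corresponding agent's network, and an urgent order could be handled by adding its operations to `ops` and calling the function again on the unfinished work.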

