A Dual-System Reinforcement Learning Method for Flexible Job Shop Dynamic Scheduling

doi:10.16183/j.cnki.jsjtu.2021.215

Abstract

Batch production tasks and research and development (R&D) tasks coexist in the production of aerospace structural parts, and personalized, small-batch R&D and production tasks lead to frequent emergency order insertions. To ensure that tasks are completed on schedule and to solve the flexible job shop dynamic scheduling problem, this paper takes minimization of the average equipment load and of the total completion time as the optimization objectives, and proposes a dual-loop deep Q network (DL-DQN) method driven by a perception-cognition dual system. The perception system uses a knowledge graph to represent workshop knowledge and to generate a multi-dimensional information matrix. The cognitive system abstracts the scheduling process into two stages, handled by a resource allocation agent and a process sequencing agent, which correspond to the two optimization objectives respectively. A workshop state matrix is designed to describe the problem and its constraints, and action instructions are introduced step by step during scheduling decisions. Finally, reward functions are designed to evaluate the resource allocation decisions and the process sequencing decisions. Application of the proposed method to aerospace shell machining at an aerospace institute, together with a comparative analysis against other algorithms, verifies the superiority of the proposed method.

Key words: perception-cognition dual system, dual-loop deep Q network (DL-DQN), dynamic scheduling, knowledge graph, multi-agent

CLC Number:

TP301

LIU Yahui, SHEN Xingwang, GU Xinghai, PENG Tao, BAO Jinsong, ZHANG Dan. A Dual-System Reinforcement Learning Method for Flexible Job Shop Dynamic Scheduling[J]. Journal of Shanghai Jiao Tong University, 2022, 56(9): 1262-1275.

Figures/Tables 20

Tab.1

Symbols and variables

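Symbol	Description
$J$	Set of workpieces
$G$	Set of equipment groups
$M$	Set of equipment
$P$	Set of personnel
$S$	Set of materials
$o_{k,i,j}$	The $j$-th operation of workpiece $J_i$ in the $k$-th task
$j$	Operation index, $j = 1, 2, \ldots, m$
$R_{k,i,j}$	Resources allocated to the $j$-th operation of workpiece $J_i$ in the $k$-th task, $R_{k,i,j} = \{M_{k,i,j}, P_{k,i,j}, S_{k,i,j}\}$
$M_{k,i,j}$	Equipment allocated to operation $o_{k,i,j}$
$P_{k,i,j}$	Operator allocated to operation $o_{k,i,j}$
$S_{k,i,j}$	Material allocated to operation $o_{k,i,j}$
$S^{T}_{k,i}$	Arrival time of workpiece $J_i$ in the $k$-th task
$s^{t}_{o_{k,i,j}}$	Start time of operation $o_{k,i,j}$
$e^{t}_{o_{k,i,j}}$	End time of operation $o_{k,i,j}$
$r^{t}_{o_{k,i,j}}$	Preparation time between operation $o_{k,i,j}$ and the next operation
$B^{T}_{k,i}$	Total processing time of workpiece $J_i$ in the $k$-th task
$W_{l,t}$ ($W_{k,i}$)	Processing load of equipment $M_t$ in equipment group $G_l$ ($W_{l,t}$ is computed per piece of equipment, $W_{k,i}$ per operation)
$L^{W}_{l,t}$	Maximum processing load of equipment $M_t$ in equipment group $G_l$
$B^{J_{k,i}}_{M_{k,i,j}}$	0-1 decision variable; a value of 1 means that workpiece $J_i$ in the $k$-th task is processed on equipment $M_{k,i,j}$
$D^{P}$	Delivery date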

Fig.1

Fig.2

Fig.3

Tab.2

Parameters of scheduling status

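Scheduling state	Parameter type	Expression	Meaning
Resource allocation state	Personnel state $f^{1}_{1,1}$	$f^{1}_{1,1,1} = P_{k,i,j}$	Personnel type
	Material state $f^{1}_{1,2}$	$f^{1}_{1,2,1} = S_{k,i,j}$	Material type
	Equipment state $f^{1}_{1,3}$	$f^{1}_{1,3,1} = M_{k,i,j}$	Equipment type
		$f^{1}_{1,3,2} = B^{J_{k,i}}_{M_{k,i,j}} = \begin{cases} 0, & \text{workpiece } i \text{ of batch task } k \text{ is not processed on equipment } M_{k,i,j} \\ 1, & \text{workpiece } i \text{ of batch task } k \text{ is processed on equipment } M_{k,i,j} \end{cases}$ ($M_{k,i,j} \in M$)	Equipment state
Process sequencing state	Process state $f^{2}_{1,1}$	$f^{2}_{1,1,1} = o_{k,i,j}$	Process state
		$f^{2}_{1,1,2} = G_{k,i,j}$	Equipment group state
	Time state $f^{2}_{1,2}$	$f^{2}_{1,2,1} = s^{t}_{o_{k,i,j}}$	Start time
		$f^{2}_{1,2,2} = e^{t}_{o_{k,i,j}}$	End time
		$f^{2}_{1,2,3} = r^{t}_{o_{k,i,j}}$	Transport time
		$f^{2}_{1,2,4} = D^{P}_{k}$	Delivery date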

Tab.3

Tab.4

Decision-making action of process sequencing

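Symbol	Description	Quantification
FIFO	First-in-first-out priority rule	$a^{2}_{t} = \min r_{k,i}$ ($r_{k,i}$ is the release time)
SPT	Shortest operation processing time priority rule	$a^{2}_{t} = \min \sum_{k=1}^{K} \sum_{i=1}^{n} \left( e^{t}_{o_{k,i,j}} - s^{t}_{o_{k,i,j}} + r^{t}_{o_{k,i,j}} \right)$
EDD	Earliest due date priority rule	$a^{2}_{t} = \min D^{P}_{k}$
SL	Shortest slack time priority rule	$a^{2}_{t} = \min \left( D^{P}_{k} - x - \sum_{j=1}^{m} B^{T}_{k,i} \right)$ ($x$ is the current time)
SRPT	Longest remaining processing time priority rule	$a^{2}_{t} = \max \sum_{j=j'}^{m} B^{T}_{k,i}$ ($j'$ is the current operation)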
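
The five sequencing actions in the table above can be read as priority keys over the jobs waiting at a decision point. The following is a minimal Python sketch of that reading, not the authors' implementation. The `Candidate` class, its field names, and the `select` function are hypothetical. It also assumes that the SPT quantity is the duration of a job's next operation (end minus start plus preparation time) and that the sum in the SL and SRPT rows is the job's remaining processing time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Candidate:
    """One waiting job at a decision point (field names are illustrative)."""
    job_id: str
    release_time: float    # r_{k,i}
    next_op_time: float    # e - s + r for the job's next operation (assumed)
    due_date: float        # D^P_k
    remaining_time: float  # processing time still to be done (assumed)


# Each rule maps a candidate to a score; the lowest score is dispatched first.
# SRPT is a "max" rule in the table, so its score is negated.
RULES: Dict[str, Callable[[Candidate, float], float]] = {
    "FIFO": lambda c, now: c.release_time,
    "SPT": lambda c, now: c.next_op_time,
    "EDD": lambda c, now: c.due_date,
    "SL": lambda c, now: c.due_date - now - c.remaining_time,
    "SRPT": lambda c, now: -c.remaining_time,
}


def select(rule: str, candidates: List[Candidate], now: float) -> Candidate:
    """Return the candidate that the named rule would dispatch next."""
    score = RULES[rule]
    return min(candidates, key=lambda c: score(c, now))


if __name__ == "__main__":
    queue = [
        Candidate("A", release_time=0, next_op_time=5, due_date=20, remaining_time=12),
        Candidate("B", release_time=2, next_op_time=3, due_date=15, remaining_time=4),
        Candidate("C", release_time=1, next_op_time=4, due_date=30, remaining_time=18),
    ]
    for name in RULES:
        print(name, select(name, queue, now=3.0).job_id)
```

In a DQN setting, an agent's discrete action would be an index into such a rule table, so the network chooses which rule to apply rather than which job to dispatch directly.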

Tab.5

Tab.6

Fig.4

Fig.5

Tab.7

Tab.8

Tab.9

Fig.6

Tab.10

Fig.7

Fig.8

Fig.9

Tab.11


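Decision method	Description
Decision 1	If the personnel and equipment are idle in the current working environment and the material supply is sufficient, select the personnel, equipment, materials, and other resources required by a workpiece, then combine them into a resource allocation scheme.
Decision 2	If the personnel and equipment are busy in the current working environment and the material supply is insufficient, evaluate the hours already worked and the skill parameters of the personnel, the load of the equipment, and the stock of the materials, then combine them into a resource allocation scheme that optimizes the equipment load.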
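
The two resource allocation decisions above can be sketched as a small Python function. This is an illustrative reading under stated assumptions, not the paper's code. The `Worker` and `Machine` classes, the `allocate` function, and the `max_hours` and `min_skill` thresholds are all hypothetical. The sketch treats decision 2 as the fallback whenever decision 1 does not apply, picks the least-loaded machine as a stand-in for "optimal equipment load", and returns `None` when the material stock cannot cover the demand, a case the table does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Worker:
    name: str
    busy: bool
    hours_worked: float  # hours already worked (assumed unit)
    skill: float         # skill parameter, higher is better (assumed)


@dataclass
class Machine:
    name: str
    busy: bool
    load: float          # accumulated processing load


def allocate(workers: List[Worker], machines: List[Machine],
             stock: float, demand: float,
             max_hours: float = 8.0, min_skill: float = 0.0
             ) -> Optional[Tuple[str, str, str]]:
    """Return (worker, machine, decision label), or None if nothing fits."""
    idle_workers = [w for w in workers if not w.busy]
    idle_machines = [m for m in machines if not m.busy]

    # Decision 1: idle personnel and equipment, sufficient material.
    if idle_workers and idle_machines and stock >= demand:
        return idle_workers[0].name, idle_machines[0].name, "decision 1"

    # Decision 2 (assumed fallback): check hours worked and skill, machine
    # load, and material stock, then prefer the least-loaded machine.
    eligible = [w for w in workers
                if w.hours_worked < max_hours and w.skill >= min_skill]
    if not eligible or not machines or stock < demand:
        return None
    worker = min(eligible, key=lambda w: w.hours_worked)
    machine = min(machines, key=lambda m: m.load)
    return worker.name, machine.name, "decision 2"
```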

Instance	Number of workpieces	Number of operations	Number of machines	Operation processing time/h
MK01	6	6	4	[1, 10]
MK02	6	6	6	[3, 8]
MK03	6	8	6	[2, 6]
MK04	8	6	4	[1, 10]
MK05	8	8	6	[2, 9]
MK06	10	6	4	[1, 10]
MK07	10	6	6	[3, 8]
MK08	10	8	6	[2, 6]
MK09	12	6	4	[1, 10]
MK10	12	8	6	[2, 9]

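Operation	Equipment group	Equipment group index
Solution treatment	Solution furnace group	G₁
Spinning	Spinning machine group	G₂
Annealing	Annealing furnace group	G₃
Aging	Aging furnace group	G₄
Rough machining/finish machining	CNC machine tool group/machining center group	G₅
Electron beam welding/laser welding	Welding machine group	G₆
Oil quenching	Oil quenching furnace group	G₇
Nitrogen quenching	Nitrogen quenching furnace group	G₈
Tempering	Tempering furnace group	G₉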

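Task	Operation 1	Operation 2	Operation 3	Operation 4	Operation 5	Operation 6	Operation 7	Operation 8
J₁	Solution treatment	Spinning	Aging	Rough machining	Welding	Nitrogen quenching	Tempering	Finish machining
J₂	Spinning	Annealing	Rough machining	Welding	Nitrogen quenching	Tempering	Finish machining
J₃	Solution treatment	Spinning	Rough machining	Welding	Oil quenching	Tempering	Finish machining
J₄	Solution treatment	Spinning	Annealing	Rough machining	Welding	Oil quenching	Tempering	Finish machining
J₅	Solution treatment	Spinning	Annealing	Rough machining	Welding	Finish machining
J₆	Solution treatment	Spinning	Aging	Rough machining	Welding	Nitrogen quenching	Tempering	Finish machining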
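
Combining the process routes above with the operation-to-equipment-group table gives, for each task, the sequence of equipment groups it must visit. The short Python sketch below illustrates this lookup. The dictionary and function names are hypothetical, the operations are written in English, the groups are written as `G1` to `G9`, and the route step "welding" is assumed to map to the welding machine group G6.

```python
from collections import Counter
from typing import Dict, List

# Operation -> equipment group, transcribed from the equipment group table.
OP_TO_GROUP: Dict[str, str] = {
    "solution treatment": "G1",
    "spinning": "G2",
    "annealing": "G3",
    "aging": "G4",
    "rough machining": "G5",
    "finish machining": "G5",
    "welding": "G6",  # assumed: electron beam or laser welding
    "oil quenching": "G7",
    "nitrogen quenching": "G8",
    "tempering": "G9",
}

# Process routes, transcribed from the route table.
ROUTES: Dict[str, List[str]] = {
    "J1": ["solution treatment", "spinning", "aging", "rough machining",
           "welding", "nitrogen quenching", "tempering", "finish machining"],
    "J2": ["spinning", "annealing", "rough machining", "welding",
           "nitrogen quenching", "tempering", "finish machining"],
    "J3": ["solution treatment", "spinning", "rough machining", "welding",
           "oil quenching", "tempering", "finish machining"],
    "J4": ["solution treatment", "spinning", "annealing", "rough machining",
           "welding", "oil quenching", "tempering", "finish machining"],
    "J5": ["solution treatment", "spinning", "annealing", "rough machining",
           "welding", "finish machining"],
    "J6": ["solution treatment", "spinning", "aging", "rough machining",
           "welding", "nitrogen quenching", "tempering", "finish machining"],
}


def group_sequence(task: str) -> List[str]:
    """Equipment groups visited by a task, in route order."""
    return [OP_TO_GROUP[op] for op in ROUTES[task]]


def group_demand() -> Counter:
    """Number of operations that each equipment group must serve."""
    return Counter(g for task in ROUTES for g in group_sequence(task))


if __name__ == "__main__":
    print(group_sequence("J2"))
    print(group_demand().most_common(3))
```

Counting operations per group in this way shows that the machining group G5 is visited twice by every task, which makes it the most contended group in this route set.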

Equipment category	Equipment model
Conventional horizontal lathe	CDZ6140-1, CDZ6140-2
	CD6140B-1
CNC horizontal lathe	CK61200W-1
	CK6146A-1
	CK64160-1, CK64160-2, CK64160-3
	CK64250-1
CKD-series CNC lathe	CKD6163-1
	CKD6163K-1, CKD6163K-2
	CKD6140S-1
	CKD6180D-1, CKD6180D-2
Pipe threading lathe	QK1319A-1, QK1319A-2