Abstract

Advanced multi-agent systems can execute tasks in complex and unknown domains that are unsuitable for humans. However, designing such systems is challenging: global and local decision-making must be balanced, and the system must adaptively generate the desired behaviors in a dynamic environment. To address these challenges, we propose a multi-agent system design framework based on bilevel closed-loop planning and use the multi-agent box-pushing problem as an example to verify it. Within this framework, the upper-level planning (used for box position prediction) and the lower-level planning (used for agent position allocation) connect and coordinate the global and local decisions. Because each planning step depends on the current system state, the two levels form a closed-loop control mechanism that takes temporary targets as input, allowing the system to adapt to various environments. We conduct multi-agent box-pushing experiments on the Webots simulation platform and compare the results with a rule-based method to demonstrate the effectiveness and advantages of our approach.
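As a rough illustration of the bilevel closed-loop idea, the following Python sketch pairs an upper-level step that proposes a temporary box target with a lower-level step that allocates agent positions, and re-plans from the updated state on every iteration. It is not the authors' implementation: the function names, the straight-line target rule, the greedy nearest-agent allocation, the idealized motion model, and the parameters `step`, `offset`, and `spacing` are all assumptions made for illustration.

```python
import math


def upper_level_plan(box, goal, step=0.5):
    """Upper level (assumed rule): next temporary box position, one step toward the goal."""
    dx, dy = goal[0] - box[0], goal[1] - box[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        return goal
    return (box[0] + step * dx / dist, box[1] + step * dy / dist)


def lower_level_plan(box, target, agents, offset=0.3, spacing=0.2):
    """Lower level (assumed rule): give each agent a pushing slot behind the box."""
    dx, dy = target[0] - box[0], target[1] - box[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        # Box is already at the target: agents keep their positions.
        return {i: a for i, a in enumerate(agents)}
    ux, uy = dx / dist, dy / dist  # unit push direction
    px, py = -uy, ux               # perpendicular direction for spreading agents
    n = len(agents)
    slots = [
        (box[0] - offset * ux + (k - (n - 1) / 2) * spacing * px,
         box[1] - offset * uy + (k - (n - 1) / 2) * spacing * py)
        for k in range(n)
    ]
    # Greedy allocation: each slot takes the nearest agent that is still free.
    free = set(range(n))
    allocation = {}
    for slot in slots:
        i = min(sorted(free), key=lambda j: math.dist(agents[j], slot))
        allocation[i] = slot
        free.remove(i)
    return allocation


def closed_loop(box, goal, agents, tol=1e-6, max_iters=100):
    """Re-plan both levels from the current state until the box reaches the goal."""
    for it in range(max_iters):
        if math.dist(box, goal) <= tol:
            return box, it
        target = upper_level_plan(box, goal)                # temporary target
        allocation = lower_level_plan(box, target, agents)  # agent positions
        # Idealized world model (assumption): agents reach their slots and
        # the box ends up exactly at the temporary target.
        agents = [allocation[i] for i in range(len(agents))]
        box = target
    return box, max_iters
```

In a simulator, the idealized update at the end of the loop would be replaced by the observed box and agent positions, which is what closes the loop: each new temporary target is computed from where the box actually ended up rather than where it was expected to be.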
