Escape with Self-adaptive Decision Radius based on Deep Deterministic Policy Gradient in Pursuit Games

Xiaojie Zhou, Chunxi Yang, Wenbo Wang, Pengqi Sun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The Pursuit-Evasion (PE) game of Unmanned Surface Vehicles (USVs) is a classic adversarial problem for intelligent agent systems. To raise the evaders' escape success rate while using their effort more efficiently, this paper proposes an escape strategy based on the geometric properties of Apollonius circles. It then proposes an improved, self-adaptive escape strategy for the evader that uses the deep deterministic policy gradient (DDPG) algorithm. Next, the criteria for successful encirclement by the pursuer are given. The DDPG-based framework is built on a Markov decision process formulation; specifically, the action space is designed around the evader's adaptive decision radius. Simulations show that the proposed method has advantages in terms of escape distance and escape time.
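The abstract does not spell out the Apollonius-circle geometry it relies on, so the following is only a minimal sketch of the textbook construction, not the authors' formulation. It assumes one pursuer and one evader on a plane with constant speeds; the function name `apollonius_circle` and the parameter `speed_ratio` (evader speed divided by pursuer speed) are hypothetical names chosen for illustration. The circle is the set of points that both agents, moving in straight lines, reach at the same time.

```python
import math

def apollonius_circle(pursuer, evader, speed_ratio):
    """Return (center, radius) of the circle of points X satisfying
    |X - evader| / |X - pursuer| == speed_ratio.

    speed_ratio is an assumed parameter: evader speed / pursuer speed.
    """
    k2 = speed_ratio ** 2
    if math.isclose(k2, 1.0):
        # With equal speeds the locus is the perpendicular bisector, not a circle.
        raise ValueError("equal speeds: the locus is a line, not a circle")
    px, py = pursuer
    ex, ey = evader
    denom = 1.0 - k2
    # Center C = (E - k^2 P) / (1 - k^2), from expanding |X-E|^2 = k^2 |X-P|^2.
    center = ((ex - k2 * px) / denom, (ey - k2 * py) / denom)
    # Radius r = k |E - P| / |1 - k^2|.
    radius = speed_ratio * math.hypot(ex - px, ey - py) / abs(denom)
    return center, radius

center, radius = apollonius_circle((0.0, 0.0), (3.0, 0.0), 0.5)
print(center, radius)
```

For a pursuer at (0, 0), an evader at (3, 0), and an evader half as fast as the pursuer, the sketch yields a circle centered at (4, 0) with radius 2; the point (2, 0), for instance, is 1 unit from the evader and 2 units from the pursuer.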

Original language: English
Title of host publication: Proceedings of 2024 IEEE 13th Data Driven Control and Learning Systems Conference, DDCLS 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 496-502
Number of pages: 7
ISBN (Electronic): 9798350361674
State: Published - 2024
Externally published: Yes
Event: 13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024 - Kaifeng, China
Duration: 17 May 2024 to 19 May 2024

Publication series

Name: Proceedings of 2024 IEEE 13th Data Driven Control and Learning Systems Conference, DDCLS 2024

Conference

Conference: 13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024
Country/Territory: China
City: Kaifeng
Period: 17/05/24 to 19/05/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • Apollonius circle
  • DDPG
  • Self-adaptive escape strategy
  • pursuit-evasion game
