Robust Deep Reinforcement Learning Using Formal Verification

  • Avraham Raviv
  • Shaiel Vistuch
  • Boaz Gurevich
  • Erel Dekel
  • Hillel Kugler

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

We propose a method to enhance the robustness and efficiency of deep reinforcement learning (DRL) by integrating formal verification techniques into the training loop. Our approach uses counterexamples generated by verification tools as corrective feedback to guide policy adjustments, enabling the agent to avoid unsafe actions and learn faster. Inspired by imitation learning, the verification tool acts as an expert that continuously refines the neural network when the agent’s policy fails. Experiments in challenging environments such as Frozen Lake and Sokoban demonstrate that our method yields substantial improvements in success rates and reduces the number of training episodes by up to 70%, all while significantly enhancing policy safety. We release the code and full reproducibility instructions at https://github.com/AvrahamRaviv/Robust-DRL-FV.
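The loop described in the abstract (verify the policy, turn each counterexample into corrective feedback, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: the grid layout, the function names (`verify`, `corrective_update`, `train_with_verification`), the tabular softmax policy, and the exhaustive check that stands in for a real verification tool are all assumptions made for illustration. The paper itself works with neural-network policies and formal verification tools.

```python
import math
import random

# Hypothetical 4x4 map in the style of Frozen Lake (S start, F frozen, H hole, G goal).
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N = len(GRID)
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up as (d_row, d_col)


def cell(state):
    row, col = divmod(state, N)
    return GRID[row][col]


def step(state, action):
    """Deterministic move; walking into a wall leaves the agent in place."""
    row, col = divmod(state, N)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N - 1)
    col = min(max(col + d_col, 0), N - 1)
    return row * N + col


def greedy(prefs, state):
    return max(range(len(MOVES)), key=lambda a: prefs[state][a])


def verify(prefs):
    """Stand-in for a formal verifier (assumption): exhaustively check the
    property 'the greedy action never enters a hole' and return the
    counterexamples as (state, safe_action) pairs."""
    counterexamples = []
    for state in range(N * N):
        if cell(state) in "HG":  # terminal cells need no action
            continue
        if cell(step(state, greedy(prefs, state))) == "H":
            safe = [a for a in range(len(MOVES)) if cell(step(state, a)) != "H"]
            if safe:
                # Suggest the safe action the policy already prefers most.
                best_safe = max(safe, key=lambda a: prefs[state][a])
                counterexamples.append((state, best_safe))
    return counterexamples


def corrective_update(prefs, counterexamples, lr=1.0):
    """Imitation-style correction: one cross-entropy gradient step that
    pushes the softmax policy toward the verifier's suggested action."""
    for state, safe_action in counterexamples:
        top = max(prefs[state])
        exps = [math.exp(v - top) for v in prefs[state]]
        total = sum(exps)
        for a in range(len(MOVES)):
            target = 1.0 if a == safe_action else 0.0
            prefs[state][a] += lr * (target - exps[a] / total)


def train_with_verification(prefs, max_rounds=100):
    """Alternate verification and correction until no counterexample remains.
    Returns the number of correction rounds that were needed."""
    for rounds in range(max_rounds):
        counterexamples = verify(prefs)
        if not counterexamples:
            return rounds
        corrective_update(prefs, counterexamples)
    return max_rounds


if __name__ == "__main__":
    random.seed(0)
    prefs = [[random.gauss(0.0, 1.0) for _ in MOVES] for _ in range(N * N)]
    rounds = train_with_verification(prefs)
    print("correction rounds:", rounds, "remaining violations:", len(verify(prefs)))
```

In a full training setup, the ordinary reinforcement learning updates would run between verification rounds; the sketch keeps only the verify-and-correct step so that the role of the counterexamples stays visible.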

Original language: English
Title of host publication: Theoretical Aspects of Software Engineering - 19th International Symposium, TASE 2025, Proceedings
Editors: Philipp Rümmer, Zhilin Wu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 179-196
Number of pages: 18
ISBN (Print): 9783031982071
State: Published - 2026
Event: 19th International Symposium on Theoretical Aspects of Software Engineering, TASE 2025 - Limassol, Cyprus
Duration: 14 Jul 2025 to 16 Jul 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15841 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Symposium on Theoretical Aspects of Software Engineering, TASE 2025
Country/Territory: Cyprus
City: Limassol
Period: 14/07/25 to 16/07/25

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
