Abstract
We propose a method to enhance the robustness and efficiency of deep reinforcement learning (DRL) by integrating formal verification techniques into the training loop. Our approach uses counterexamples generated by verification tools as corrective feedback to guide policy adjustments, enabling the agent to avoid unsafe actions and learn faster. Inspired by imitation learning, the verification tool acts as an expert that continuously refines the neural network when the agent’s policy fails. Experiments in challenging environments such as Frozen Lake and Sokoban demonstrate that our method yields substantial improvements in success rates and reduces the number of training episodes by up to 70%, all while significantly enhancing policy safety. We release the code and full reproducibility instructions at https://github.com/AvrahamRaviv/Robust-DRL-FV.
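The feedback loop described in the abstract (verify the policy, collect counterexamples, correct the policy, repeat) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the 3x3 grid, the tabular score table `q`, the exhaustive `verify` function standing in for a real formal verification tool, and the `correct` update rule are all hypothetical. The reinforcement learning updates themselves are omitted; only the verify-and-correct cycle is shown.

```python
# Toy sketch of counterexample-guided policy correction.
# All names and rules here are illustrative assumptions, not the authors' code.

GRID = [
    "SFF",
    "FHF",
    "FFG",
]  # S = start, F = frozen (safe), H = hole (unsafe), G = goal
N = 3
# Action encoding as (row delta, column delta): 0 left, 1 down, 2 right, 3 up.
ACTIONS = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def cell(state):
    """Return the grid character for a flat state index."""
    r, c = divmod(state, N)
    return GRID[r][c]


def step(state, action):
    """Deterministic transition; moves off the grid are clamped to the border."""
    r, c = divmod(state, N)
    dr, dc = ACTIONS[action]
    nr = min(max(r + dr, 0), N - 1)
    nc = min(max(c + dc, 0), N - 1)
    return nr * N + nc


def greedy(q, state):
    """Pick the highest-scoring action; ties go to the lowest action index."""
    return max(ACTIONS, key=lambda a: q[state][a])


def verify(q):
    """Stand-in for a formal verifier: exhaustively check every non-terminal
    state and return (state, action) pairs whose greedy action enters a hole."""
    counterexamples = []
    for s in range(N * N):
        if cell(s) in "HG":
            continue
        a = greedy(q, s)
        if cell(step(s, a)) == "H":
            counterexamples.append((s, a))
    return counterexamples


def correct(q, counterexamples, penalty=1.0):
    """Corrective feedback: push each unsafe action below every other action
    in that state so it is no longer the greedy choice."""
    for s, a in counterexamples:
        q[s][a] = min(q[s]) - penalty


def train_with_verification(q, max_rounds=10):
    """Alternate verification and correction until no counterexample remains.
    Returns the number of correction rounds that were needed."""
    for round_idx in range(max_rounds):
        counterexamples = verify(q)
        if not counterexamples:
            return round_idx
        correct(q, counterexamples)
    return max_rounds


# A deliberately unsafe starting policy: states 1 and 3 prefer to step into the hole.
q = [[0.0] * 4 for _ in range(N * N)]
q[1][1] = 1.0  # state (0, 1) prefers "down" into the hole at (1, 1)
q[3][2] = 1.0  # state (1, 0) prefers "right" into the hole at (1, 1)

print("counterexamples before:", verify(q))
rounds = train_with_verification(q)
print("correction rounds:", rounds)
print("counterexamples after:", verify(q))
```

In a real setting the exhaustive check would be replaced by a neural network verifier and the table update by a gradient step on the policy network; the sketch only shows how counterexamples can drive the correction.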
| Original language | English |
|---|---|
| Title of host publication | Theoretical Aspects of Software Engineering - 19th International Symposium, TASE 2025, Proceedings |
| Editors | Philipp Rümmer, Zhilin Wu |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 179-196 |
| Number of pages | 18 |
| ISBN (Print) | 9783031982071 |
| State | Published - 2026 |
| Event | 19th International Symposium on Theoretical Aspects of Software Engineering, TASE 2025 - Limassol, Cyprus. Duration: 14 Jul 2025 → 16 Jul 2025 |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Volume | 15841 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 19th International Symposium on Theoretical Aspects of Software Engineering, TASE 2025 |
|---|---|
| Country/Territory | Cyprus |
| City | Limassol |
| Period | 14/07/25 → 16/07/25 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.