Extracting automata from recurrent neural networks using queries and counterexamples (extended version)

Gail Weiss, Yoav Goldberg, Eran Yahav

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

We consider the problem of extracting a deterministic finite automaton (DFA) from a trained recurrent neural network (RNN). We present a novel algorithm that uses exact learning and abstract interpretation to perform efficient extraction of a minimal DFA describing the state dynamics of a given RNN. We use Angluin’s L* algorithm as a learner and the given RNN as an oracle, refining the abstraction of the RNN only as much as necessary for answering equivalence queries. Our technique allows DFA-extraction from the RNN while avoiding state explosion, even when the state vectors are large and fine differentiation is required between RNN states. We experiment on multi-layer GRUs and LSTMs with state-vector dimensions, alphabet sizes, and underlying DFAs that are significantly larger than in previous DFA-extraction work. Additionally, we discuss when it may be relevant to apply the technique to RNNs trained as language models rather than binary classifiers, and present experiments on some such examples. In some of our experiments, the underlying target language can be described with a succinct DFA, yet we find that the extracted DFA is large and complex. These are cases in which the RNN has failed to learn the intended generalisation, and our extraction procedure highlights words which are misclassified by the seemingly “perfect” RNN.
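As a rough illustration of the oracle setup the abstract describes, the sketch below shows the two query types an exact learner relies on: a membership query, answered by the classifier itself, and an equivalence query, which either confirms a hypothesis DFA or returns a counterexample word. Everything here is an assumption made for illustration: `rnn_accepts` is a hypothetical stand-in for a trained RNN (it simply accepts words with an even number of `a`s), the `DFA` class and the query function names are invented, and the equivalence check is a bounded brute-force search, not the abstraction-refinement procedure of the paper. The learner itself is not shown.

```python
from itertools import product

ALPHABET = "ab"  # assumed toy alphabet


def rnn_accepts(word: str) -> bool:
    """Hypothetical stand-in for a trained RNN binary classifier.

    Accepts exactly the words containing an even number of 'a's.
    """
    return word.count("a") % 2 == 0


class DFA:
    """Minimal DFA: delta maps (state, symbol) -> state."""

    def __init__(self, start, accepting, delta):
        self.start = start
        self.accepting = set(accepting)
        self.delta = delta

    def accepts(self, word: str) -> bool:
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


def membership_query(word: str) -> bool:
    """Membership query: ask the oracle to classify a single word."""
    return rnn_accepts(word)


def equivalence_query(hypothesis: DFA, max_len: int = 6):
    """Approximate equivalence query by bounded exhaustive search.

    Returns the first word (shortest first) on which the hypothesis and
    the oracle disagree, or None if they agree on all words up to max_len.
    This brute-force check is an assumption of the sketch; it replaces
    the abstraction-based check described in the abstract.
    """
    for length in range(max_len + 1):
        for letters in product(ALPHABET, repeat=length):
            word = "".join(letters)
            if hypothesis.accepts(word) != membership_query(word):
                return word
    return None


# A too-coarse hypothesis: one accepting state, so every word is accepted.
coarse = DFA(0, {0}, {(0, "a"): 0, (0, "b"): 0})

# A hypothesis that tracks the parity of 'a's with two states.
parity = DFA(0, {0}, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1})

print(equivalence_query(coarse))  # a counterexample word
print(equivalence_query(parity))  # None: no disagreement up to max_len
```

In a learner such as L*, a returned counterexample would be used to refine the hypothesis before the next equivalence query; a result of `None` from this bounded search only means that no disagreement was found up to `max_len`, not that the two are truly equivalent.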

Original language: English
Pages (from-to): 2877-2919
Number of pages: 43
Journal: Machine Learning
Volume: 113
Issue number: 5
State: Published - May 2024

Bibliographical note

Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature.

Funding

The authors thank Rémi Eyraud, Xiaokun Luan, and the anonymous reviewers for their constructive comments. The research leading to the results presented in this paper is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 802774 (iEXTRACT).

Funders and funder numbers
Horizon 2020 Framework Programme: 802774
European Commission

    Keywords

    • Automata
    • Deterministic finite automata
    • Exact learning
    • Extraction
    • Recurrent neural networks
