Abstract
Logs produced by software applications are invaluable for spotting deviations from expected system behavior. However, automatically detecting anomalies from log data is challenging due to the volume, semi-structured nature, lack of standard formatting, and potential evolution of log records over time. In this work, we approach log-based anomaly detection as a semantic similarity problem. We generate pairwise similarity scores using a general-purpose pre-trained language model and further augment them with ground-truth binary labels. The generated similarity labels supervise an encoder trained for semantic similarity. At inference time, anomalies are detected based on the cosine similarity between the encoded query sequence and the average normal encoding. Our method outperforms contemporary techniques on multiple benchmarks without template extraction or a fixed vocabulary and achieves competitive performance even when provided with limited abnormal examples.
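The inference step described above, comparing an encoded query sequence with the average normal encoding, can be illustrated with a minimal sketch. It assumes the encodings are already available as plain float vectors. The function names, the toy vectors, and the `threshold` value are hypothetical and are not taken from the paper, which does not specify them in the abstract.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise mean of the encodings of known-normal log sequences."""
    if not vectors:
        raise ValueError("need at least one normal encoding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_anomalous(query: Vector, normal_reference: Vector,
                 threshold: float = 0.5) -> bool:
    """Flag a query whose similarity to the normal reference is below threshold.

    The threshold is an assumed, illustrative value; in practice it would
    have to be tuned on held-out data.
    """
    return cosine_similarity(query, normal_reference) < threshold


# Toy 2-dimensional encodings standing in for real encoder outputs.
normal_encodings = [[1.0, 0.0], [0.9, 0.1]]
reference = mean_vector(normal_encodings)

close_to_normal = is_anomalous([1.0, 0.0], reference)
far_from_normal = is_anomalous([0.0, 1.0], reference)
```

In this toy setup the query that points in nearly the same direction as the reference is kept as normal, while the orthogonal query is flagged. A real pipeline would replace the hand-written vectors with outputs of the trained encoder.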
Original language | English |
---|---|
Title of host publication | Proceedings - 2024 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024 |
Publisher | Association for Computing Machinery, Inc |
Pages | 2438-2439 |
Number of pages | 2 |
ISBN (Electronic) | 9798400712487 |
State | Published - 27 Oct 2024 |
Externally published | Yes |
Event | 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024 - Sacramento, United States; Duration: 28 Oct 2024 → 1 Nov 2024 |
Publication series
Name | Proceedings - 2024 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024 |
---|---|
Conference
Conference | 39th ACM/IEEE International Conference on Automated Software Engineering, ASE 2024 |
---|---|
Country/Territory | United States |
City | Sacramento |
Period | 28/10/24 → 1/11/24 |
Bibliographical note
Publisher Copyright: © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.