Improving severity classification of Hebrew PET-CT pathology reports using test-time augmentation

Seffi Cohen, Edo Lior, Moshe Bocher, Lior Rokach

Research output: Contribution to journal › Article › peer-review



Classifying medical reports written in Hebrew is challenging because of the ambiguity and complexity of the language. Hebrew is morphologically rich, so a single word often admits multiple interpretations. This study proposes Text Test Time Augmentation (TTTA), a novel method that improves the classification of cancer severity levels from Hebrew PET-CT diagnostic reports. TTTA leverages test-time augmentation to enhance text information retrieval and model robustness: during both the training and testing phases, it generates and evaluates sets of augmentations to enrich the semantics extracted from each report. Experiments use a large institutional report repository from Ziv hospital, Israel, in which the reports were manually labeled by physicians. The proposed TTTA approach outperforms baseline models without TTA, improving PR-AUC by 15.18% on cancer severity classification. The study highlights the efficacy of TTTA in extracting essential medical concepts from free-text reports and accurately classifying cancer severity. The approach addresses limitations of prior methods and contributes to improved automated analysis of Hebrew medical reports. TTTA has the potential to assist physicians in cancer diagnosis and treatment planning.
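As an illustration only, the general idea of test-time augmentation for text can be sketched as follows. The abstract does not specify which augmentations TTTA uses or how their predictions are combined, so everything here is an assumption: `drop_words` (random word dropout), `toy_predict_proba` (a stand-in keyword classifier), and simple probability averaging in `tta_predict` are hypothetical choices, not the authors' method.

```python
import random
from typing import Callable, List

def drop_words(text: str, rng: random.Random, p: float = 0.2) -> str:
    """Hypothetical augmentation: randomly drop each word with probability p."""
    words = text.split()
    kept = [w for w in words if rng.random() >= p]
    # Fall back to the original text if every word was dropped.
    return " ".join(kept) if kept else text

def toy_predict_proba(text: str) -> List[float]:
    """Stand-in classifier returning [P(low severity), P(high severity)]."""
    words = text.split()
    hits = sum(w in {"metastasis", "uptake"} for w in words)
    p_high = (1 + hits) / (2 + len(words))
    return [1.0 - p_high, p_high]

def tta_predict(
    text: str,
    predict_proba: Callable[[str], List[float]],
    augment: Callable[[str, random.Random], str],
    n_aug: int = 4,
    seed: int = 0,
) -> List[float]:
    """Average class probabilities over the original text and n_aug augmented copies."""
    rng = random.Random(seed)
    variants = [text] + [augment(text, rng) for _ in range(n_aug)]
    probs = [predict_proba(v) for v in variants]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]

if __name__ == "__main__":
    report = "intense uptake suggests metastasis"
    print(tta_predict(report, toy_predict_proba, drop_words, n_aug=5, seed=1))
```

In a realistic setting the stand-in classifier would be replaced by a trained model, and the augmentation would be chosen to suit Hebrew text; averaging is only one of several possible ways to aggregate the per-variant predictions.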

Original language: English
Article number: 104577
Journal: Journal of Biomedical Informatics
State: Published - Jan 2024

Bibliographical note

Publisher Copyright:
© 2023


Keywords

  • NLPH
  • PET-CT
  • Reports-classification
  • TTA

