Automatic classification of complaint letters according to service provider categories

Yaakov HaCohen-Kerner, Rakefet Dilmon, Maor Hone, Matanya Aharon Ben-Basan

Research output: Contribution to journal › Article › peer-review

15 Scopus citations


In the technological age, the number of complaint letters published on the Internet is increasing. It is therefore important to classify complaint letters automatically according to various criteria, such as company categories. In this research, we investigated the automatic text classification of complaint letters written in Hebrew that were sent to companies from a wide variety of categories. The classification was performed according to company categories such as insurance, cellular communication, and rental cars. We conducted an extensive set of experiments classifying complaint letters into seven, six, five, and four company categories. The experiments were performed using various sets of word unigrams, four machine learning methods, two feature filtering methods, and parameter tuning. The classification results are relatively high for all six measures: accuracy, precision, recall, F1, PRC-area, and ROC-area. The best accuracy results for seven, six, five, and four categories are 84.5%, 88.4%, 91.4%, and 93.8%, respectively. An analysis of the most frequently occurring words in the complaints in almost all categories revealed that the most significant issues were related to poor service and delayed delivery. An interesting result is that only in the domain of hospitals was the subject of the domain itself (i.e., the patient, the medical treatment, the place of the treatment, and the medical staff) the most important issue. Another interesting finding is that the issue of “price” was of little or no importance to the complainants. These findings suggest that, in their preoccupation with their bottom line of profitability, many service providers are blind to how paramount good service and timely delivery (and, in the case of hospitals, the domain itself) are to their clientele.
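As a rough illustration of the kind of pipeline the abstract describes (bag-of-words unigram features fed to a supervised classifier, then scored by accuracy), the following is a minimal sketch only. The abstract does not name the four machine learning methods or the toolkit used, so multinomial Naive Bayes is an assumed stand-in, and the English toy letters, the labels, and the function names `unigrams`, `train_nb`, `predict_nb`, and `accuracy` are all hypothetical. Feature filtering and parameter tuning are omitted.

```python
import math
from collections import Counter, defaultdict


def unigrams(text):
    """Split a letter into lowercase word unigrams (plain whitespace tokenization)."""
    return text.lower().split()


def train_nb(letters, labels):
    """Collect per-category unigram counts for a multinomial Naive Bayes model.

    Naive Bayes is an illustrative choice, not a method confirmed by the abstract.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(letters, labels):
        tokens = unigrams(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the category with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_letters = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_letters in class_counts.items():
        score = math.log(n_letters / total_letters)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in unigrams(text):
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, letters, labels):
    """Fraction of letters whose predicted category matches the true category."""
    hits = sum(predict_nb(model, t) == y for t, y in zip(letters, labels))
    return hits / len(labels)


# Hypothetical toy data; the study itself used Hebrew complaint letters.
train_letters = [
    "my claim was rejected and the policy premium went up",
    "the insurance agent never answered about my policy",
    "my phone has no signal and the data plan is slow",
    "the cellular network dropped every call on my phone",
    "the rental car was dirty and the pickup was delayed",
    "they charged me for a scratch on the rental car",
]
train_labels = [
    "insurance", "insurance",
    "cellular", "cellular",
    "rental cars", "rental cars",
]

model = train_nb(train_letters, train_labels)
```

Calling `predict_nb(model, "the phone signal is bad")` assigns an unseen letter to one of the three toy categories, and `accuracy(model, letters, labels)` scores a labeled set. In practice accuracy would be measured on held-out letters, for example with cross-validation, rather than on the training data.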

Original language: English
Article number: 102102
Journal: Information Processing and Management
Issue number: 6
State: Published - Nov 2019

Bibliographical note

Publisher Copyright:
© 2019 Elsevier Ltd


  • Bag of words
  • Complaint letters
  • Semantic fields
  • Service providers
  • Supervised machine learning
  • Text classification

