Automatic opinion extraction from short Hebrew texts using machine learning techniques

Dror Mughaz, Tzeviya Fuchs, Dan Bouhnik

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Sentiment analysis deals with classifying written texts according to their polarity. Previous research on this topic has been conducted mostly for Latin languages, and no research has been done for Hebrew. This gap matters because text classification turns out to be extremely language-dependent. Furthermore, work on sentiment analysis for English texts has mostly been performed on relatively long documents. In this work, we focus specifically on classifying Modern Hebrew sentences according to their polarity. We compare various machine learning algorithms and classification techniques. We add optimizations and methods that have not previously been used, and adjust commonly used techniques to suit a Hebrew corpus. We elaborate on the differences between classifying short texts and long ones, and on what is unique about working specifically with Hebrew. Our model achieves nearly 93% accuracy, which is higher than accuracies previously achieved in this field.

Original language: English
Pages (from-to): 1347-1357
Number of pages: 11
Journal: Computacion y Sistemas
Volume: 22
Issue number: 4
State: Published - 2018

Bibliographical note

Publisher Copyright:
© 2018 Instituto Politecnico Nacional. All rights reserved.

Keywords

  • Automatic classification
  • Machine learning
  • Sentiment analysis
  • Short Hebrew texts
