Positive and Negative Sentiment Words in a Blog Corpus Written in Hebrew

Yaakov HaCohen-Kerner, Haim Badash

Research output: Contribution to journal › Conference article › peer-review

9 Scopus citations

Abstract

In this research, given a corpus of blog posts written in Hebrew and two seed sentiment lists, we analyze the positive and negative sentences in the corpus, as well as special groups of words that are associated with the positive and negative seed words. We discovered many new negative words (around half of the top 50 words) but only one new positive word. Among the top words associated with the positive seed words, we found various first-person and third-person pronouns. Intensifiers were found for both the positive and the negative seed words. Most sentences in the corpus are neutral; among the rest, more than 80% are positive. The sentiment scores of the top words associated with the positive words are significantly higher than those of the top words associated with the negative words. Our conclusions are as follows. Positive sentences refer more to the authors themselves (first-person pronouns and related words) and are also more general, e.g., more often related to other people (third-person pronouns), whereas negative sentences concentrate much more on negative things and therefore contain many new negative words. Israeli bloggers tend to use intensifiers to emphasize or even exaggerate their sentiments (both positive and negative). These bloggers not only write many more positive sentences than negative ones, but also write much longer positive sentences than negative ones.

Original language: English
Pages (from-to): 733-743
Number of pages: 11
Journal: Procedia Computer Science
Volume: 96
State: Published - 2016
Externally published: Yes
Event: 20th International Conference on Knowledge Based and Intelligent Information and Engineering Systems, KES 2016 - York, United Kingdom
Duration: 5 Sep 2016 to 7 Sep 2016

Bibliographical note

Publisher Copyright:
© 2016 The Authors. Published by Elsevier B.V.

Keywords

  • Blog corpus
  • Hebrew
  • Natural Language Processing
  • Negative words
  • Positive words
  • Seed lists
  • Sentiment
