Distinguishing Between True and False Stories Using Various Linguistic Features

Yaakov HaCohen-Kerner, Rakefet Dilmon, Shimon Friedlich, Daniel Nissim Cohen

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

This paper analyzes which linguistic features differentiate true from false stories written in Hebrew. To this end, we defined four feature sets containing 145 features in total: POS-Tags, quantitative, repetition, and special expressions. The examined corpus contains stories composed by 48 native Hebrew speakers who were asked to tell both false and true stories. We ran classification experiments on all possible combinations of these four feature sets using five supervised machine learning methods. The Part of Speech (POS) set outperformed all others and proved to be a key component. The best accuracy (89.6%) was achieved by a combination of sixteen POS-Tags and one quantitative feature.
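As a rough illustration of what "all possible combinations of these four feature sets" covers, the sketch below enumerates every non-empty subset of the four sets named in the abstract. It is not the authors' code: the function name `feature_set_combinations` is hypothetical, and the abstract does not name the five learning methods or say how the features were extracted, so the sketch stops at listing the combinations.

```python
from itertools import combinations

# The four feature sets named in the abstract.
FEATURE_SETS = ["POS-Tags", "quantitative", "repetition", "special expressions"]


def feature_set_combinations(sets):
    """Return every non-empty combination of the given feature sets.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    combos = []
    for size in range(1, len(sets) + 1):
        combos.extend(combinations(sets, size))
    return combos


combos = feature_set_combinations(FEATURE_SETS)
print(len(combos))  # 2**4 - 1 = 15 non-empty combinations
for combo in combos:
    print(" + ".join(combo))
```

If each of the five methods were run on each combination (an assumption, since the abstract does not give the exact protocol), that would amount to 15 × 5 = 75 classification runs.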
Original language: English
Title of host publication: Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation
Subtitle of host publication: Posters
Editors: Hai Zhao
Place of Publication: Shanghai
Publisher: Pacific Asia Conference on Language, Information and Computation
Pages: 176-186
Number of pages: 11
State: Published - 2015
