Distinguishing between true and false stories using various linguistic features

Yaakov HaCohen-Kerner, Rakefet Dilmon, Shimon Friedlich, Daniel Nissim Cohen

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation

Abstract

This paper analyzes which linguistic features differentiate true stories from false stories written in Hebrew. To this end, we defined four feature sets containing a total of 145 features: POS-Tags, quantitative, repetition, and special expressions. The examined corpus contains stories composed by 48 native Hebrew speakers, each of whom was asked to tell both false and true stories. We ran classification experiments on all possible combinations of these four feature sets using five supervised machine learning methods. The Part of Speech (POS) set outperformed all the others and proved to be a key component. The best accuracy (89.6%) was achieved by a combination of sixteen POS-Tags and one quantitative feature.
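As a rough illustration of the experimental design described in the abstract, "all possible combinations" of four feature sets amounts to 2^4 - 1 = 15 non-empty subsets. The sketch below only enumerates those subsets; it assumes the empty combination is excluded, and the function and variable names are hypothetical rather than taken from the paper. It does not reproduce the authors' features, classifiers, or results.

```python
from itertools import combinations

# The four feature-set names as listed in the abstract.
FEATURE_SETS = ["POS-Tags", "quantitative", "repetition", "special expressions"]


def all_feature_set_combinations(sets):
    """Return every non-empty subset of the given feature sets, smallest first.

    Hypothetical helper for illustration only; the paper does not publish code.
    """
    return [
        combo
        for size in range(1, len(sets) + 1)
        for combo in combinations(sets, size)
    ]


combos = all_feature_set_combinations(FEATURE_SETS)

if __name__ == "__main__":
    # Each combination would be passed to each classifier in turn.
    for combo in combos:
        print(" + ".join(combo))
```

With five learning methods, each of these subsets would be evaluated five times, which is one plausible reading of how the experiment grid was laid out.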

Original language: English
Pages: 176-186
Number of pages: 11
State: Published - 2015
Event: 29th Pacific Asia Conference on Language, Information and Computation, PACLIC 2015 - Shanghai, China
Duration: 30 Oct 2015 to 1 Nov 2015

Conference

Conference: 29th Pacific Asia Conference on Language, Information and Computation, PACLIC 2015
Country/Territory: China
City: Shanghai
Period: 30/10/15 to 1/11/15
