Detecting Fake News in URDU using Classical Supervised Machine Learning Methods and Word/Char N-grams

Yaakov HaCohen-Kerner, Natan Manor, Netanel Bashan, Elyasaf Dimant

Research output: Contribution to journal › Conference article › peer-review

Abstract

In this paper, we describe our submissions to the UrduFake 2021 track, in which we tackled the task entitled “Fake News Detection in the Urdu Language”. We developed different models using three classical supervised machine learning methods: Support Vector Classifier (SVC), Random Forest, and Logistic Regression. These models were applied to various sets of character or word n-gram features. Our best submission was an SVC model using 7,500 character trigrams. This model ranked 11th out of the 34 teams that participated in the track.
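As a rough illustration of the feature side of this approach, the sketch below extracts character trigrams and keeps only the most frequent ones as a fixed vocabulary. It is a minimal sketch, not the authors' pipeline: the helper names (`char_ngrams`, `build_vocabulary`, `vectorize`), the use of raw counts, the tie-breaking rule, and the toy documents are all assumptions made for illustration, and the classifier step is omitted.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_vocabulary(docs, n=3, max_features=7500):
    """Map the `max_features` most frequent n-grams in `docs` to column indices."""
    counts = Counter()
    for doc in docs:
        counts.update(char_ngrams(doc, n))
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: idx for idx, (gram, _) in enumerate(ranked[:max_features])}


def vectorize(doc, vocab, n=3):
    """Count how often each vocabulary n-gram occurs in `doc`."""
    vec = [0] * len(vocab)
    for gram in char_ngrams(doc, n):
        idx = vocab.get(gram)
        if idx is not None:  # n-grams outside the vocabulary are ignored
            vec[idx] += 1
    return vec


# Hypothetical toy corpus; a real run would use the track's Urdu news articles.
docs = ["abcabc", "bcabca"]
vocab = build_vocabulary(docs, n=3, max_features=3)
features = [vectorize(d, vocab) for d in docs]
```

The resulting count vectors could then be passed to any of the classifiers named in the abstract. Whether the submitted models used raw counts, TF-IDF weights, or some other scheme is not stated here.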

Original language: English
Pages (from-to): 1162-1167
Number of pages: 6
Journal: CEUR Workshop Proceedings
Volume: 3159
State: Published - 2021
Externally published: Yes
Event: Working Notes of FIRE - 13th Forum for Information Retrieval Evaluation, FIRE-WN 2021 - Gandhinagar, India
Duration: 13 Dec 2021 → 17 Dec 2021

Bibliographical note

Publisher Copyright:
© 2021 Copyright for this paper by the Forum for Information Retrieval Evaluation, December 13-17, 2021, India.

Keywords

  • Fake news
  • supervised machine learning
  • word/char n-grams
