Abstract
In this paper, we describe our submissions for the UrduFake 2021 track, in which we tackled the task “Fake News Detection in the Urdu Language”. We developed models using three classical supervised machine learning methods: Support Vector Classifier (SVC), Random Forest, and Logistic Regression. The models were applied to various sets of character or word n-gram features. Our best submission was an SVC model using 7,500 character trigrams; it ranked 11th out of the 34 teams that participated in the track.
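As a rough illustration of what a feature set of character trigrams can look like, the sketch below builds a fixed-size trigram vocabulary and turns each text into a count vector. It is not the authors' pipeline. The function names, the choice of the most frequent trigrams as the selection rule, the use of raw counts rather than some other weighting, and the toy sentences are all assumptions made for this example.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_vocabulary(documents, n=3, max_features=7500):
    """Map the `max_features` most frequent n-grams to column indices.

    Selecting by corpus frequency is an assumption; the abstract only
    states that 7,500 character trigrams were used.
    """
    counts = Counter()
    for doc in documents:
        counts.update(char_ngrams(doc, n))
    top = counts.most_common(max_features)
    return {gram: idx for idx, (gram, _) in enumerate(top)}


def vectorize(text, vocabulary, n=3):
    """Count vector of `text` over the vocabulary; unseen n-grams are ignored."""
    vector = [0] * len(vocabulary)
    for gram in char_ngrams(text, n):
        idx = vocabulary.get(gram)
        if idx is not None:
            vector[idx] += 1
    return vector


# Toy corpus, invented for illustration (not taken from the UrduFake data).
docs = ["یہ خبر سچ ہے", "یہ خبر جھوٹ ہے"]
vocab = build_vocabulary(docs)
features = [vectorize(d, vocab) for d in docs]
```

Vectors of this kind could then be passed to any of the three classifiers named above, for example through a library such as scikit-learn.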
Original language | English |
---|---|
Pages (from-to) | 1162-1167 |
Number of pages | 6 |
Journal | CEUR Workshop Proceedings |
Volume | 3159 |
State | Published - 2021 |
Externally published | Yes |
Event | Working Notes of FIRE - 13th Forum for Information Retrieval Evaluation, FIRE-WN 2021 - Gandhinagar, India. Duration: 13 Dec 2021 → 17 Dec 2021 |
Bibliographical note
Publisher Copyright: © 2021 Copyright for this paper by the Forum for Information Retrieval Evaluation, December 13-17, 2021, India.
Keywords
- Fake news
- supervised machine learning
- word/char n-grams