JCT at SemEval-2022 Task 6-A: Sarcasm Detection in Tweets Written in English and Arabic using Preprocessing Methods and Word N-grams

Yaakov HaCohen-Kerner, Matan Fchima, Ilan Meyrowitsch

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we describe our submissions to the SemEval-2022 contest. We tackled subtask 6-A, "iSarcasmEval: Intended Sarcasm Detection in English and Arabic - Binary Classification". We developed separate models for the two languages, English and Arabic. We applied 4 supervised machine learning methods, 6 preprocessing methods for English and 3 for Arabic, and 3 oversampling methods. Our best submitted model for the English test dataset was an SVC model trained after balancing the dataset with SMOTE and removing stop words. For the Arabic test dataset, our best submitted model was an SVC model whose preprocessing removed elongation.
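
As a rough illustration of the kind of preprocessing and feature extraction the abstract mentions, the Python sketch below shows stop-word removal, elongation removal, and word n-gram counting. It is not the authors' implementation. The stop-word list, the elongation rule (dropping the Arabic tatweel character and collapsing runs of three or more identical characters), and the names `STOP_WORDS`, `remove_elongation`, `word_ngrams`, and `featurize` are assumptions made for illustration. The classification (SVC) and oversampling (SMOTE) steps are omitted.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (not the list used in the paper).
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and"}

# Arabic tatweel (kashida), a character used to stretch words visually.
TATWEEL = "\u0640"


def remove_stop_words(tokens):
    """Drop tokens that appear in the (assumed) stop-word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def remove_elongation(text):
    """Assumed rule: strip tatweel, then collapse 3+ repeated characters to one."""
    text = text.replace(TATWEEL, "")
    return re.sub(r"(.)\1{2,}", r"\1", text)


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def featurize(text, n_values=(1, 2), drop_stop_words=True, fix_elongation=True):
    """Turn a tweet into a bag of word n-gram counts after optional preprocessing."""
    if fix_elongation:
        text = remove_elongation(text)
    tokens = text.split()  # whitespace tokenization, an assumption
    if drop_stop_words:
        tokens = remove_stop_words(tokens)
    counts = Counter()
    for n in n_values:
        counts.update(word_ngrams(tokens, n))
    return counts
```

For example, `featurize("the weather is sooooo great")` counts the unigrams `weather`, `so`, `great` and the bigrams `weather so`, `so great`. Counts like these could then be passed to any classifier that accepts sparse bag-of-n-gram features.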

Original language: English
Title of host publication: SemEval 2022 - 16th International Workshop on Semantic Evaluation, Proceedings of the Workshop
Editors: Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan
Publisher: Association for Computational Linguistics (ACL)
Pages: 1031-1038
Number of pages: 8
ISBN (Electronic): 9781955917803
State: Published - 2022
Externally published: Yes
Event: 16th International Workshop on Semantic Evaluation, SemEval 2022 - Seattle, United States
Duration: 14 Jul 2022 – 15 Jul 2022

Publication series

Name: SemEval 2022 - 16th International Workshop on Semantic Evaluation, Proceedings of the Workshop

Conference

Conference: 16th International Workshop on Semantic Evaluation, SemEval 2022
Country/Territory: United States
City: Seattle
Period: 14/07/22 – 15/07/22

Bibliographical note

Publisher Copyright:
© 2022 Association for Computational Linguistics.
