Abstract
This paper describes our submissions to SemEval-2022 subtask 4-A, "Patronizing and Condescending Language Detection: Binary Classification". We developed several models for this subtask, applying 11 supervised machine learning methods and 9 preprocessing methods. Our best submission was a model built with BertForSequenceClassification. Our experiments indicate that a preprocessing stage is essential for a successful model. The dataset for Subtask 1 is highly imbalanced, and the F1-scores obtained on the oversampled training dataset were higher than those obtained on the original training dataset.
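
The abstract does not say which oversampling technique was used. As an illustration only, the sketch below assumes plain random oversampling, in which minority-class examples are duplicated until every class matches the size of the largest one. The function name `random_oversample` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def random_oversample(texts, labels, seed=0):
    """Balance classes by duplicating randomly chosen minority examples.

    Assumption: simple random oversampling; the paper may have used
    a different technique.
    """
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    pairs = []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        pairs.extend((text, label) for text in items + extra)

    # Shuffle so that duplicated examples are not clustered together.
    rng.shuffle(pairs)
    new_texts = [text for text, _ in pairs]
    new_labels = [label for _, label in pairs]
    return new_texts, new_labels
```

Oversampling of this kind would normally be applied to the training split only, so that evaluation scores still reflect the original class distribution.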
Original language | English |
---|---|
Title of host publication | SemEval 2022 - 16th International Workshop on Semantic Evaluation, Proceedings of the Workshop |
Editors | Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 519-524 |
Number of pages | 6 |
ISBN (Electronic) | 9781955917803 |
State | Published - 2022 |
Externally published | Yes |
Event | 16th International Workshop on Semantic Evaluation, SemEval 2022 - Seattle, United States; Duration: 14 Jul 2022 → 15 Jul 2022 |
Conference
Conference | 16th International Workshop on Semantic Evaluation, SemEval 2022 |
---|---|
Country/Territory | United States |
City | Seattle |
Period | 14/07/22 → 15/07/22 |
Bibliographical note
Publisher Copyright: © 2022 Association for Computational Linguistics.