Abstract
Social media posts and their comments are a rich source of social data. In this study, we aim to classify comments as relevant or irrelevant to the content of their posts. Since comments in social media are usually short, their bag-of-words (BoW) representations are highly sparse. We investigate four semantic vector representations for the relevance classification task, along with different types of large unlabeled data for learning the distributional representations. We also empirically demonstrate that expanding the input of the task to include the post text does not improve classification performance over using only the comment text. We show that representing the comment in the post space is an inexpensive and effective representation for comment relevance classification.
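To make the sparsity point concrete, the following is a minimal Python sketch, not the authors' implementation. It builds plain BoW count vectors for two short comments and reports how many entries are zero. It also shows one possible reading of "representing the comment in the post space": counting comment terms only over the vocabulary of the post. That reading, the whitespace tokenizer, the example texts, and all function names are assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption; real systems use better tokenization).
    return text.lower().split()


def bow_vector(text, vocabulary):
    # Count how often each vocabulary term occurs in the text.
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


def sparsity(vector):
    # Fraction of entries that are zero.
    return sum(1 for value in vector if value == 0) / len(vector)


# Hypothetical example data, not taken from the paper.
post = "the city council approved a new budget for public transport and road repairs"
comments = ["great news for public transport", "check out my phone deals"]

# Vocabulary over the post and all comments, and vocabulary of the post alone.
corpus_vocab = sorted(set(tokenize(post)) | {t for c in comments for t in tokenize(c)})
post_vocab = sorted(set(tokenize(post)))

for comment in comments:
    full = bow_vector(comment, corpus_vocab)
    in_post_space = bow_vector(comment, post_vocab)  # assumed reading of "post space"
    print(comment, "| sparsity:", sparsity(full), "| overlap with post:", sum(in_post_space))
```

Even with this tiny vocabulary most entries of each comment vector are zero, and the effect grows with a realistic vocabulary of many thousands of terms. That is the motivation the abstract gives for moving to denser semantic representations.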
Original language | English |
---|---|
Title of host publication | Computational Linguistics and Intelligent Text Processing - 18th International Conference, CICLing 2017, Revised Selected Papers |
Editors | Alexander Gelbukh |
Publisher | Springer Verlag |
Pages | 241-254 |
Number of pages | 14 |
ISBN (Print) | 9783319771151 |
State | Published - 2018 |
Externally published | Yes |
Event | 18th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2017 - Budapest, Hungary; Duration: 17 Apr 2017 → 23 Apr 2017 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 10762 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 18th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2017 |
---|---|
Country/Territory | Hungary |
City | Budapest |
Period | 17/04/17 → 23/04/17 |
Bibliographical note
Publisher Copyright: © Springer Nature Switzerland AG 2018.
Keywords
- Comment relevance classification
- Machine learning
- Semantic analysis
- Social media
- Supervised learning