Detection of mental health conditions from Reddit via deep contextualized representations

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

61 Scopus citations

Abstract

We address the problem of automatically detecting psychiatric disorders from the linguistic content of social media posts. We build a large-scale dataset of Reddit posts from users with eight disorders and from a control user group. We extract and analyze linguistic characteristics of the posts and identify differences between diagnostic groups. We build strong classification models based on deep contextualized word representations and show that they outperform previously applied statistical models with simple linguistic features by large margins. We compare user-level and post-level classification performance, and we also evaluate an ensembled multiclass model.

Original language: English
Title of host publication: EMNLP 2020 - 11th International Workshop on Health Text Mining and Information Analysis, LOUHI 2020, Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 147-156
Number of pages: 10
ISBN (Electronic): 9781952148811
DOIs
State: Published - 2020
Externally published: Yes
Event: 11th International Workshop on Health Text Mining and Information Analysis, LOUHI 2020, co-located with EMNLP 2020 - Virtual, Online
Duration: 20 Nov 2020 → …

Publication series

Name: EMNLP 2020 - 11th International Workshop on Health Text Mining and Information Analysis, LOUHI 2020, Proceedings of the Workshop

Conference

Conference: 11th International Workshop on Health Text Mining and Information Analysis, LOUHI 2020, co-located with EMNLP 2020
City: Virtual, Online
Period: 20/11/20 → …

Bibliographical note

Publisher Copyright:
© 2020 Association for Computational Linguistics
