Predicting User Demography and Device from News Comments

Ohad Rozen, Joel Oren, Ariel Raviv

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Demographics of online users, such as age and gender, play an important role in personalized web applications, particularly in the news domain. However, it is difficult to directly obtain the demographic information of online users. Past works have attempted to predict user demography based on reading patterns obtained from news browsing data. However, such data can be very limited. Luckily, in recent years, posts and comments have become much more prevalent among online users, and the comments from users of different demographics exhibit differences in content and writing style. Thus, comments can provide additional clues for demographic prediction. In this paper, we study predicting users' demographics based on both news browsing data and the associated user-generated comments. To this end, we make novel use of a recently introduced BERT-based model to embed each comment in the context of its associated article. We experiment on real-world datasets and explore the contribution of both browsing data and user-generated data to the task of predicting three different user attributes: gender, location type (e.g., rural vs. urban), and mobile device. Finally, we show that our approach can effectively improve the performance of such predictions and outperforms baseline methods.

Original language: English
Title of host publication: SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 1995-1999
Number of pages: 5
ISBN (Electronic): 9781450380379
State: Published - 11 Jul 2021
Event: 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2021 - Virtual, Online, Canada
Duration: 11 Jul 2021 to 15 Jul 2021

Publication series

Name: SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2021
Country/Territory: Canada
City: Virtual, Online
Period: 11/07/21 to 15/07/21

Bibliographical note

Publisher Copyright:
© 2021 ACM.

Keywords

  • comments
  • demographic prediction
  • news
  • user modeling
