Utilizing data driven methods to identify gender bias in LinkedIn profiles.

Research output: Contribution to journal › Article › peer-review

20 Scopus citations

Abstract

The use of Artificial Intelligence-enabled recruitment systems has become an important component of modern talent recruitment, particularly through social networks such as LinkedIn and Facebook. However, data overflow embedded in recruitment systems, based on Natural Language Processing (NLP) methods, may result in unconscious gender bias. The purpose of this work is to utilize a set of methods to analyze and detect textual bias in different groups. We analyzed a training dataset of fourteen thousand LinkedIn profiles, provided by a company named Talenya, which included job candidates who fit IT-related positions. We aimed to detect gender-gap patterns in textual self-presentation, utilizing Term Frequency-Inverse Document Frequency (TF-IDF), word2vec and the Universal Sentence Encoder (USE) to encode the data, and applied the kernel two-sample test to determine whether men's and women's LinkedIn profiles have the same distribution. Additionally, we focused on identifying and quantifying repetition in skills representation in the LinkedIn profile by applying TF-IDF and cosine similarity, and compared the repetitiveness patterns of men's and women's profiles. Gender-based analysis was also carried out on smaller, more homogeneous groups of candidates who share the same position type, geographical location and organizational seniority level. Finally, we provide theoretical and practical implications.

  • Machine learning methods are used to detect gender differences in LinkedIn profiles.
  • The kernel two-sample test shows mild gender differences in the numeric data.
  • NLP and machine learning methods identify gender differences in textual data.
  • Location, position type and seniority level are found to be differentiating factors.
  • Addition of artificial skills to the LinkedIn profile is necessary for some sub-groups.
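
The kernel two-sample test named in the abstract can be illustrated with a minimal sketch. It compares two sets of encoded profiles (for example, embedding vectors) using the squared Maximum Mean Discrepancy (MMD) and estimates a p-value by permuting group labels. The Gaussian kernel, the bandwidth `gamma`, the permutation scheme and all function names here are assumptions made for illustration; the abstract does not state which kernel or null-distribution approximation was used.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, gamma):
    """Pairwise kernel values for all points (computed once, reused per permutation)."""
    n = len(points)
    return [[rbf_kernel(points[i], points[j], gamma) for j in range(n)]
            for i in range(n)]


def mmd2(gram, idx_x, idx_y):
    """Biased estimate of squared MMD between the groups indexed by idx_x and idx_y."""
    def mean_block(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))

    return (mean_block(idx_x, idx_x)
            + mean_block(idx_y, idx_y)
            - 2.0 * mean_block(idx_x, idx_y))


def kernel_two_sample_test(x, y, gamma=1.0, n_permutations=500, seed=0):
    """Return (observed MMD^2, permutation p-value) for samples x and y.

    Hypothetical helper: x and y are lists of numeric vectors, e.g. encoded
    profiles of two groups. A small p-value suggests different distributions.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    gram = gram_matrix(pooled, gamma)
    n_x = len(x)
    indices = list(range(len(pooled)))
    observed = mmd2(gram, indices[:n_x], indices[n_x:])

    # Under the null hypothesis group labels are exchangeable, so shuffling
    # them gives draws from the null distribution of the statistic.
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(indices)
        if mmd2(gram, indices[:n_x], indices[n_x:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value


# Toy usage with synthetic 2-D vectors standing in for encoded profiles.
_rng = random.Random(42)
group_a = [[_rng.gauss(0, 1), _rng.gauss(0, 1)] for _ in range(20)]
group_b = [[_rng.gauss(1, 1), _rng.gauss(1, 1)] for _ in range(20)]
statistic, p_value = kernel_two_sample_test(group_a, group_b, gamma=0.5,
                                            n_permutations=200)
print(f"MMD^2 = {statistic:.4f}, p = {p_value:.4f}")
```

In practice the bandwidth is often chosen from the data (for instance, from the median pairwise distance), and the result can be sensitive to that choice, so this sketch should be read as a starting point rather than a reproduction of the study's analysis.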
Original language: English
Article number: 103423
Number of pages: 1
Journal: Information Processing and Management
Volume: 60
Issue number: 5
DOIs
State: Published - 1 Sep 2023

Bibliographical note

Publisher Copyright:
© 2023 Elsevier Ltd

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 5 - Gender Equality

Keywords

  • Gender inequality
  • Machine learning
  • Sex discrimination
  • Natural language processing
  • Implicit bias
  • Gender mainstreaming
  • Cosine Similarity
  • Kernel two-sample test
  • LinkedIn profiles
  • Textual bias
  • TF-IDF
  • Universal sentence encoder
