Information-Theory Interpretation of the Skip-Gram Negative-Sampling Objective Function

Jacob Goldberger, Oren Melamud

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

8 Scopus citations

Abstract

In this paper, we define a measure of dependency between two random variables, based on the Jensen-Shannon (JS) divergence between their joint distribution and the product of their marginal distributions. Then, we show that word2vec’s skip-gram with negative sampling embedding algorithm finds the optimal low-dimensional approximation of this JS dependency measure between the words and their contexts. The gap between the optimal score and the low-dimensional approximation is demonstrated on a standard text corpus.
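The dependency measure described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so this example assumes the standard, equally weighted Jensen-Shannon divergence; the paper's measure may weight the two distributions differently. The function names `kl_divergence` and `js_dependency` and the toy distributions are hypothetical, chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two flat lists of probabilities.

    Terms with p_i == 0 contribute 0 by convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_dependency(joint):
    """JS divergence between a joint distribution and the product of its marginals.

    `joint` is a list of rows with joint[w][c] = p(w, c); entries sum to 1.
    Assumption: equal weights of 1/2 on both distributions (standard JS).
    """
    p_w = [sum(row) for row in joint]            # marginal over words
    p_c = [sum(col) for col in zip(*joint)]      # marginal over contexts
    flat_joint = [p for row in joint for p in row]
    flat_indep = [pw * pc for pw in p_w for pc in p_c]
    # Mixture of the two distributions, used as the reference in both KL terms.
    mix = [0.5 * (a + b) for a, b in zip(flat_joint, flat_indep)]
    return 0.5 * kl_divergence(flat_joint, mix) + 0.5 * kl_divergence(flat_indep, mix)


# Toy examples: words and contexts independent vs. perfectly coupled.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]

print(js_dependency(independent))  # zero: joint equals product of marginals
print(js_dependency(dependent))    # strictly positive, at most log(2)
```

Under these assumptions the score is zero exactly when the two variables are independent and is bounded above by log 2, which is what makes it usable as a dependency measure.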

Original language: English
Title of host publication: ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Short Papers)
Publisher: Association for Computational Linguistics (ACL)
Pages: 167-171
Number of pages: 5
ISBN (Electronic): 9781945626760
DOIs
State: Published - 2017
Event: 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017 - Vancouver, Canada
Duration: 30 Jul 2017 → 4 Aug 2017

Publication series

Name: ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Short Papers)
Volume: 2

Conference

Conference: 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
Country/Territory: Canada
City: Vancouver
Period: 30/07/17 → 4/08/17

Bibliographical note

Publisher Copyright:
© 2017 Association for Computational Linguistics.

Funding

This work is supported by the Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI).

Funders | Funder number
Institut Claudius Regaud
Intel Collaborative Research Institute for Computational Intelligence
