Abstract
In this paper, we define a measure of dependency between two random variables, based on the Jensen-Shannon (JS) divergence between their joint distribution and the product of their marginal distributions. Then, we show that word2vec’s skip-gram with negative sampling embedding algorithm finds the optimal low-dimensional approximation of this JS dependency measure between the words and their contexts. The gap between the optimal score and the low-dimensional approximation is demonstrated on a standard text corpus.
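The dependency measure described in the abstract can be illustrated with a small numerical sketch. The code below assumes the standard equal-weight Jensen-Shannon divergence and a joint distribution given as a word-by-context count table; the paper's exact definition may differ (for example, by weighting the two distributions unequally), and the function names `kl_divergence` and `js_dependency` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def kl_divergence(p, q):
    """KL(p || q) in nats; entries where p == 0 contribute nothing."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def js_dependency(joint):
    """Equal-weight JS divergence between a joint distribution and the
    product of its marginals (an assumed form of the dependency measure)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()  # normalise counts to probabilities
    # Product of the row marginal (words) and column marginal (contexts).
    product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    mixture = 0.5 * (joint + product)
    return 0.5 * kl_divergence(joint, mixture) + 0.5 * kl_divergence(product, mixture)


# Independent variables: the joint equals the product of marginals.
independent = np.outer([0.3, 0.7], [0.4, 0.6])
# Fully dependent variables: all mass on the diagonal.
dependent = np.array([[0.5, 0.0], [0.0, 0.5]])

print(js_dependency(independent))
print(js_dependency(dependent))
```

Under these assumptions the score is zero when the two variables are independent and grows with the strength of the dependency, staying below log 2 in nats.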
| Original language | English |
|---|---|
| Title of host publication | ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Short Papers) |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 167-171 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781945626760 |
| DOIs | |
| State | Published - 2017 |
| Event | 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada. Duration: 30 Jul 2017 → 4 Aug 2017 |
Publication series
| Name | ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) |
|---|---|
| Volume | 2 |
Conference
| Conference | 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017 |
|---|---|
| Country/Territory | Canada |
| City | Vancouver |
| Period | 30/07/17 → 4/08/17 |
Bibliographical note
Publisher Copyright: © 2017 Association for Computational Linguistics.
Funding
This work is supported by the Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI).
| Funders |
|---|
| Institut Claudius Regaud |
| Intel Collaborative Research Institute for Computational Intelligence |
Title
Information-Theory Interpretation of the Skip-Gram Negative-Sampling Objective Function