TY - UNPB
T1 - LSH Microbatches for Stochastic Gradients: Value in Rearrangement
AU - Buchnik, E.
AU - Cohen, E.
AU - Hassidim, A.
AU - Matias, Y.
PY - 2018/9/28
Y1 - 2018/9/28
N2 - Metric embeddings are immensely useful representations of associations between entities (images, users, search queries, words, and more). Embeddings are learned by optimizing a loss objective of the general form of a sum over example associations. Typically, the optimization uses stochastic gradient updates over minibatches of examples that are arranged independently at random. In this work, we propose the use of structured arrangements through randomized microbatches of examples that are more likely to include similar ones. We make a principled argument for the properties of our arrangements that accelerate the training and present efficient algorithms to generate microbatches that respect the marginal distribution of training examples. Finally, we observe experimentally that our structured arrangements accelerate training by 3-20%. Structured arrangements emerge as a powerful and novel performance knob for SGD that is independent and complementary to other SGD hyperparameters and thus is a candidate for wide deployment.
UR - https://openreview.net/forum?id=r1erRoCqtX
M3 - Preprint
T3 - arXiv preprint arXiv:1803.05389
BT - LSH Microbatches for Stochastic Gradients: Value in Rearrangement
ER -