Large scale online learning of image similarity through ranking

Gal Chechik, Varun Sharma, Uri Shalit, Samy Bengio

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

23 Scopus citations


Learning a measure of similarity between pairs of objects is a fundamental problem in machine learning. Pairwise similarity plays a crucial role in classification algorithms like nearest neighbors, and is practically important for applications like searching for images that are similar to a given image or finding videos that are relevant to a given video. In these tasks, users look for objects that are both visually similar and semantically related to a given object. Unfortunately, current approaches for learning semantic similarity are limited to small-scale datasets, because their complexity grows quadratically with the sample size, and because they impose costly positivity constraints on the learned similarity functions. To address real-world large-scale AI problems, like learning similarity over all images on the web, we need to develop new algorithms that scale to many samples, many classes, and many features. This abstract presents OASIS, an Online Algorithm for Scalable Image Similarity learning that learns a bilinear similarity measure over sparse representations. OASIS is an online dual approach using the passive-aggressive family of learning algorithms with a large margin criterion and an efficient hinge loss cost. Our experiments show that OASIS is both fast and accurate at a wide range of scales: for a dataset with thousands of images, it achieves better results than existing state-of-the-art methods, while being an order of magnitude faster. Comparing OASIS with different symmetric variants provides unexpected insights into the effect of symmetry on the quality of the similarity. For large, web-scale datasets, OASIS can be trained on more than two million images from 150K text queries within two days on a single CPU. Human evaluations showed that 35% of the ten top images ranked by OASIS were semantically relevant to a query image.
This suggests that query-independent similarity could be accurately learned even for large-scale datasets that could not be handled before.
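
To make the ingredients named in the abstract concrete (a bilinear similarity, a hinge loss with a margin, and a passive-aggressive update), the following is a minimal sketch, not the authors' implementation. It assumes training on triplets (a query, a more relevant item, a less relevant item), which the abstract does not spell out; the function names, the margin of 1, and the aggressiveness parameter `C` are illustrative assumptions.

```python
def bilinear_similarity(W, p, q):
    """Bilinear score S_W(p, q) = p^T W q for dense lists p, q and matrix W."""
    return sum(p[i] * W[i][j] * q[j]
               for i in range(len(p)) for j in range(len(q)))


def pa_triplet_update(W, p, p_pos, p_neg, C=0.1):
    """One passive-aggressive step on an (assumed) triplet (p, p_pos, p_neg).

    Updates W in place and returns the hinge loss measured BEFORE the update.
    """
    # Hinge loss with margin 1: we want S(p, p_pos) >= S(p, p_neg) + 1.
    loss = max(0.0, 1.0
               - bilinear_similarity(W, p, p_pos)
               + bilinear_similarity(W, p, p_neg))
    if loss == 0.0:
        return loss  # "passive": the margin is already satisfied

    # Update direction V = p (p_pos - p_neg)^T, a rank-one matrix.
    diff = [a - b for a, b in zip(p_pos, p_neg)]
    # Squared Frobenius norm of a rank-one matrix: ||p||^2 * ||diff||^2.
    norm_sq = sum(a * a for a in p) * sum(d * d for d in diff)
    if norm_sq == 0.0:
        return loss  # degenerate triplet, nothing to update

    # "Aggressive": step just far enough to fix the violation, capped at C.
    tau = min(C, loss / norm_sq)
    for i, p_i in enumerate(p):
        if p_i == 0.0:
            continue  # zero entries of a sparse query touch no rows of W
        for j, d_j in enumerate(diff):
            W[i][j] += tau * p_i * d_j
    return loss
```

Note that this sketch places no symmetry or positivity constraint on `W`, in line with the abstract's remark that such constraints are costly. When the cap `C` is not active, a single step drives the hinge loss on that triplet to zero.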

Original language: English
Title of host publication: Pattern Recognition and Image Analysis - 4th Iberian Conference, IbPRIA 2009, Proceedings
Number of pages: 4
State: Published - 2009
Externally published: Yes
Event: 4th Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2009 - Povoa de Varzim, Portugal
Duration: 10 Jun 2009 - 12 Jun 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5524 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 4th Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2009
City: Povoa de Varzim

