Differentiable scene graphs

Moshiko Raboh, Roei Herzig, Jonathan Berant, Gal Chechik, Amir Globerson

Research output: Chapter in Book/Report/Conference proceeding | Conference contribution | peer-review

26 Scopus citations

Abstract

Reasoning about complex visual scenes involves perception of entities and their relations. Scene Graphs (SGs) provide a natural representation for reasoning tasks by assigning labels to both entities (nodes) and relations (edges). Reasoning systems based on SGs are typically trained in a two-step procedure: first, a model is trained to predict SGs from images, and next, a separate model is trained to reason based on the predicted SGs. However, it would seem preferable to train such systems in an end-to-end manner. The challenge, which we address here, is that scene-graph representations are non-differentiable, and it is therefore unclear how to use them as intermediate components. Here we propose Differentiable Scene Graphs (DSGs), an image representation that is amenable to differentiable end-to-end optimization and requires supervision only from the downstream tasks. DSGs provide a dense representation for all regions and pairs of regions, and do not spend modelling capacity on regions of the image that do not contain objects or relations of interest. We evaluate our model on the challenging task of identifying referring relationships (RR) in three benchmark datasets: Visual Genome, VRD and CLEVR. Using DSGs as an intermediate representation leads to new state-of-the-art performance. The full code is available at https://github.com/shikorab/DSG.

Original language: English
Title of host publication: Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1477-1486
Number of pages: 10
ISBN (Electronic): 9781728165530
DOIs
State: Published - Mar 2020
Event: 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020 - Snowmass Village, United States
Duration: 1 Mar 2020 to 5 Mar 2020

Publication series

Name: Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020

Conference

Conference: 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020
Country/Territory: United States
City: Snowmass Village
Period: 1/03/20 to 5/03/20

Bibliographical note

Publisher Copyright:
© 2020 IEEE.

Funding

This project was funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant ERC HOLI 819080).

Funders (with funder number, where listed):
Horizon 2020 Framework Programme: 802800
European Commission
Horizon 2020: 819080
