Human pose estimation using deep consensus voting

Ita Lifshitz, Ethan Fetaya, Shimon Ullman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

103 Scopus citations

Abstract

In this paper we consider the problem of human pose estimation from a single still image. We propose a novel approach where each location in the image votes for the position of each keypoint using a convolutional neural net. The voting scheme allows us to utilize information from the whole image, rather than rely on a sparse set of keypoint locations. Using dense, multi-target votes not only produces good keypoint predictions but also enables us to compute image-dependent joint keypoint probabilities by looking at consensus voting. This differs from most previous methods, where joint probabilities are learned from relative keypoint locations and are independent of the image. Finally, we combine the keypoint votes and joint probabilities in order to identify the optimal pose configuration. We demonstrate competitive performance on the MPII Human Pose and Leeds Sports Pose datasets.
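
As a rough illustration of the dense voting idea described in the abstract, the Python sketch below lets every grid location cast a weighted vote for where a single keypoint lies, sums the votes into a heatmap, and takes the location with the strongest consensus. The offset-dictionary vote format, the function names (`build_toy_votes`, `aggregate_votes`, `argmax_2d`), and the toy grid are assumptions made for illustration only; in the paper the votes come from a convolutional neural net, and this sketch does not cover the joint keypoint probabilities or the final pose inference.

```python
from typing import Dict, List, Tuple

Offset = Tuple[int, int]     # (dy, dx): displacement from the voting location
Votes = Dict[Offset, float]  # offset -> vote weight cast by one location


def build_toy_votes(height: int, width: int, target: Tuple[int, int]) -> List[List[Votes]]:
    """Hypothetical stand-in for network output: every location votes for
    `target`, except location (0, 0), which wrongly votes for itself."""
    ty, tx = target
    votes = [[{(ty - y, tx - x): 1.0} for x in range(width)] for y in range(height)]
    votes[0][0] = {(0, 0): 1.0}  # a single outlier vote
    return votes


def aggregate_votes(votes: List[List[Votes]]) -> List[List[float]]:
    """Sum the votes from all locations into a heatmap for one keypoint."""
    height, width = len(votes), len(votes[0])
    heatmap = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for (dy, dx), weight in votes[y][x].items():
                vy, vx = y + dy, x + dx
                # Ignore votes that point outside the image.
                if 0 <= vy < height and 0 <= vx < width:
                    heatmap[vy][vx] += weight
    return heatmap


def argmax_2d(heatmap: List[List[float]]) -> Tuple[int, int]:
    """Return the (row, column) of the largest heatmap value."""
    best_pos, best_val = (0, 0), heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_val:
                best_pos, best_val = (y, x), value
    return best_pos


if __name__ == "__main__":
    heatmap = aggregate_votes(build_toy_votes(3, 3, target=(1, 2)))
    print(argmax_2d(heatmap))
```

In this toy setting the single outlier vote is outweighed by the other eight locations, which is the intuition behind using information from the whole image rather than a sparse set of keypoint locations.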

Original language: English
Title of host publication: Computer Vision - 14th European Conference, ECCV 2016, Proceedings
Editors: Bastian Leibe, Nicu Sebe, Max Welling, Jiri Matas
Publisher: Springer Verlag
Pages: 246-260
Number of pages: 15
ISBN (Print): 9783319464749
DOIs
State: Published - 2016
Externally published: Yes
Event: 14th European Conference on Computer Vision, ECCV 2016 - Amsterdam, Netherlands
Duration: 8 Oct 2016 → 16 Oct 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9906 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th European Conference on Computer Vision, ECCV 2016
Country/Territory: Netherlands
City: Amsterdam
Period: 8/10/16 → 16/10/16

Bibliographical note

Publisher Copyright:
© Springer International Publishing AG 2016.
