Abstract
Visual pose regression models estimate the camera pose from a query image with a single forward pass. Current models learn pose encoding from an image using deep convolutional networks which are trained per scene. The resulting encoding is typically passed to a multi-layer perceptron in order to regress the pose. In this work, we propose that scene-specific pose encoders are not required for pose regression and that encodings trained for visual similarity can be used instead. In order to test our hypothesis, we take a shallow architecture of several fully connected layers and train it with pre-computed encodings from a generic image retrieval model. We find that these encodings are not only sufficient to regress the camera pose, but that, when provided to a branching fully connected architecture, a trained model can achieve competitive results and even surpass current state-of-the-art pose regressors in some cases. Moreover, we show that for outdoor localization, the proposed architecture is the only pose regressor, to date, consistently localizing in under 2 meters and 5 degrees.
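The branching idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the split into one position branch and one orientation branch, the unit-quaternion output, the layer sizes, the random (untrained) weights, and all function names. The error function uses the Euclidean distance and the quaternion angle, a common convention for reporting thresholds such as 2 meters and 5 degrees.

```python
import math
import random

def linear(x, weights, bias):
    """Fully connected layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(n_in, n_out, rng):
    """Random weights for illustration only; a real model would be trained."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def regress_pose(encoding, shared, pos_head, rot_head):
    """Shared layer, then two branches (assumed): position and orientation."""
    h = relu(linear(encoding, *shared))
    position = linear(h, *pos_head)            # (x, y, z)
    q = linear(h, *rot_head)                   # raw 4-vector
    norm = math.sqrt(sum(v * v for v in q)) or 1.0
    quaternion = [v / norm for v in q]         # normalized to unit length
    return position, quaternion

def pose_error(pos, quat, pos_gt, quat_gt):
    """Position error (same units as the poses) and orientation error in degrees."""
    t_err = math.dist(pos, pos_gt)
    dot = abs(sum(a * b for a, b in zip(quat, quat_gt)))
    r_err = math.degrees(2.0 * math.acos(min(1.0, dot)))
    return t_err, r_err

# Hypothetical sizes; a pre-computed retrieval encoding would be much larger.
ENC_DIM, HIDDEN = 16, 8
rng = random.Random(0)
shared = init_layer(ENC_DIM, HIDDEN, rng)
pos_head = init_layer(HIDDEN, 3, rng)
rot_head = init_layer(HIDDEN, 4, rng)

encoding = [rng.random() for _ in range(ENC_DIM)]  # stand-in for a real encoding
position, quaternion = regress_pose(encoding, shared, pos_head, rot_head)
```

In this sketch the image encoder is not part of the model at all: `encoding` stands in for a vector computed once by a generic retrieval network, so only the small fully connected layers would need training per scene.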
| Original language | English |
|---|---|
| Title of host publication | Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 3186-3192 |
| Number of pages | 7 |
| ISBN (Electronic) | 9781728188089 |
| State | Published - 2020 |
| Externally published | Yes |
| Event | 25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Milan, Italy; Duration: 10 Jan 2021 → 15 Jan 2021 |
Publication series
| Name | Proceedings - International Conference on Pattern Recognition |
|---|---|
| ISSN (Print) | 1051-4651 |
Conference
| Conference | 25th International Conference on Pattern Recognition, ICPR 2020 |
|---|---|
| Country/Territory | Italy |
| City | Virtual, Milan |
| Period | 10/01/21 → 15/01/21 |
Bibliographical note
Publisher Copyright: © 2020 IEEE