Layer Folding: Neural Network Depth Reduction using Activation Linearization

Amir Ben Dror, Niv Zehngut, Avraham Raviv, Evgeny Artyomov, Ran Vitek

Research output: Contribution to conference › Paper › peer-review

Abstract

Despite the increasing prevalence of deep neural networks, their applicability in resource-constrained devices is limited due to their computational load. While modern devices exhibit a high level of parallelism, real-time latency is still highly dependent on networks' depth. Although recent works show that below a certain depth, the width of shallower networks must grow exponentially, we presume that neural networks typically exceed this minimal depth to accelerate convergence and incrementally increase accuracy. This motivates us to transform pre-trained deep networks that already exploit such advantages into shallower forms. We propose a method that learns whether non-linear activations can be removed, allowing consecutive linear layers to be folded into one. We use our method to provide more efficient alternatives to MobileNet and EfficientNet architectures on the ImageNet classification task. We release our code and trained models at https://github.com/LayerFolding/Layer-Folding.
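
As a rough illustration of the folding step the abstract describes (a minimal sketch only, not the authors' released implementation; the helper fold_linear_pair and the layer sizes below are hypothetical), the following PyTorch snippet collapses two consecutive fully-connected layers into a single equivalent layer once the activation between them has been reduced to the identity:

import torch
import torch.nn as nn

def fold_linear_pair(fc1: nn.Linear, fc2: nn.Linear) -> nn.Linear:
    # Fold y = W2 (W1 x + b1) + b2 into y = (W2 W1) x + (W2 b1 + b2).
    # This is exact whenever the activation between fc1 and fc2 is the identity.
    folded = nn.Linear(fc1.in_features, fc2.out_features, bias=True)
    with torch.no_grad():
        folded.weight.copy_(fc2.weight @ fc1.weight)
        bias = fc2.bias.clone() if fc2.bias is not None else torch.zeros(fc2.out_features)
        if fc1.bias is not None:
            bias += fc2.weight @ fc1.bias
        folded.bias.copy_(bias)
    return folded

# Sanity check: the folded layer reproduces the two-layer composition.
fc1, fc2 = nn.Linear(8, 16), nn.Linear(16, 4)
x = torch.randn(2, 8)
assert torch.allclose(fold_linear_pair(fc1, fc2)(x), fc2(fc1(x)), atol=1e-5)

The same algebra applies to consecutive convolutions or to a convolution followed by batch normalization; the paper's contribution is learning, during fine-tuning, which activations may be linearized so that such folds become valid.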

Original language: English
State: Published - 2022
Externally published: Yes
Event: 33rd British Machine Vision Conference Proceedings, BMVC 2022 - London, United Kingdom
Duration: 21 Nov 2022 - 24 Nov 2022

Conference

Conference: 33rd British Machine Vision Conference Proceedings, BMVC 2022
Country/Territory: United Kingdom
City: London
Period: 21/11/22 - 24/11/22

Bibliographical note

Publisher Copyright:
© 2022. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.
