Abstract
Despite the increasing prevalence of deep neural networks, their applicability to resource-constrained devices is limited by their computational load. While modern devices exhibit a high level of parallelism, real-time latency still depends heavily on a network's depth. Although recent works show that below a certain depth the width of shallower networks must grow exponentially, we presume that neural networks typically exceed this minimal depth to accelerate convergence and incrementally increase accuracy. This motivates us to transform pre-trained deep networks that already exploit these advantages into shallower forms. We propose a method that learns whether non-linear activations can be removed, allowing consecutive linear layers to be folded into one. We use our method to provide more efficient alternatives to the MobileNet and EfficientNet architectures on the ImageNet classification task. We release our code and trained models at https://github.com/LayerFolding/Layer-Folding.
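The folding step rests on a standard algebraic identity: once the activation between two linear layers is removed, their composition is itself a single linear map, since y = W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2). Below is a minimal PyTorch sketch of that identity, not the authors' released code; the helper name `fold_linear_pair` is illustrative only.

```python
import torch
import torch.nn as nn

def fold_linear_pair(fc1: nn.Linear, fc2: nn.Linear) -> nn.Linear:
    """Fold two consecutive linear layers with no activation between
    them into one equivalent layer:
        y = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
    Illustrative sketch; the paper's method learns which activations
    can be removed before such a fold becomes valid.
    """
    folded = nn.Linear(fc1.in_features, fc2.out_features)
    with torch.no_grad():
        # nn.Linear stores weights as (out_features, in_features),
        # so the composed weight is a plain matrix product.
        folded.weight.copy_(fc2.weight @ fc1.weight)
        folded.bias.copy_(fc2.weight @ fc1.bias + fc2.bias)
    return folded

# Sanity check: the folded layer reproduces the two-layer composition.
fc1, fc2 = nn.Linear(8, 16), nn.Linear(16, 4)
x = torch.randn(2, 8)
assert torch.allclose(fc2(fc1(x)), fold_linear_pair(fc1, fc2)(x), atol=1e-6)
```

The same reasoning extends to convolutions, which are also linear operators, though composing them requires more bookkeeping than a single matrix product.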
| Original language | English |
| --- | --- |
| State | Published - 2022 |
| Externally published | Yes |
| Event | 33rd British Machine Vision Conference Proceedings, BMVC 2022 - London, United Kingdom |
| Duration | 21 Nov 2022 → 24 Nov 2022 |
Conference
| Conference | 33rd British Machine Vision Conference Proceedings, BMVC 2022 |
| --- | --- |
| Country/Territory | United Kingdom |
| City | London |
| Period | 21/11/22 → 24/11/22 |
Bibliographical note
Publisher Copyright: © 2022. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.