Efficient shallow learning mechanism as an alternative to deep learning

Ofek Tevet, Ronit D. Gross, Shiri Hodassman, Tal Rogachevsky, Yarden Tzach, Yuval Meir, Ido Kanter

Research output: Contribution to journal › Article › peer-review


Abstract

Deep learning architectures comprising tens or even hundreds of convolutional and fully connected hidden layers differ greatly from the shallow architecture of the brain. Here, we demonstrate that by increasing the relative number of filters per layer of a generalized shallow architecture, the error rates decay as a power law to zero. Additionally, a quantitative method for measuring the performance of a single filter shows that each filter identifies small clusters of possible output labels, with additional noise appearing as labels selected outside these clusters. For a given generalized architecture, this average noise per filter also decays as a power law with an increasing number of filters per layer, forming the underlying mechanism of efficient shallow learning. The results are supported by the training of the generalized LeNet-3, VGG-5, and VGG-16 on CIFAR-100 and suggest an increase in the noise power-law exponent for deeper architectures. The presented underlying shallow learning mechanism calls for further quantitative examination across various databases and shallow architectures.
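The power-law decay described in the abstract (error rate falling as a power of the number of filters per layer) can, in principle, be estimated with a simple log-log linear fit. The sketch below is a minimal illustration of such a fit, not the paper's method or data; the filter counts, error values, and variable names are hypothetical placeholders.

```python
# Hedged sketch: estimating the exponent of a power-law decay
# err(K) ~ A * K**(-rho), where K is the number of filters per layer.
# The (K, err) pairs below are illustrative placeholders, NOT results from the paper.
import numpy as np

K = np.array([16, 32, 64, 128, 256, 512])              # filters per layer (hypothetical)
err = np.array([0.62, 0.55, 0.49, 0.44, 0.39, 0.35])   # test error rates (hypothetical)

# A power law is linear in log-log coordinates: log(err) = log(A) - rho * log(K)
slope, intercept = np.polyfit(np.log(K), np.log(err), deg=1)
rho, A = -slope, np.exp(intercept)

print(f"Estimated exponent rho = {rho:.3f}, prefactor A = {A:.3f}")
```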

Original language: English
Article number: 129513
Journal: Physica A: Statistical Mechanics and its Applications
Volume: 635
DOIs
State: Published - 1 Feb 2024

Bibliographical note

Publisher Copyright:
© 2024 Elsevier B.V.

Funding

This work was supported by the Israel Science Foundation [grant number 346/22].

Funders: Israel Science Foundation
Funder number: 346/22

Keywords

• Deep learning
• Machine learning
• Shallow learning
