Factorized CRF with batch normalization based on the entire training data

Eran Goldman, Jacob Goldberger

Research output: Contribution to journal › Conference article › peer-review


Batch normalization (BN) is a key component of most neural network architectures. A major weakness of BN is its critical dependence on a reasonably large batch size, due to the inherent approximation of estimating the mean and variance from a single batch of data. Another weakness is the difficulty of applying BN in autoregressive or structured models. In this study we show that, for any network node obtained as a linear function of the input features, it is feasible to replace standard BN by computing the mean and variance over the entire training dataset. We dub this method Full Batch Normalization (FBN). Our main focus is a factorized autoregressive CRF model, for which we show that FBN is applicable and allows BN to be integrated into the linear-chain CRF likelihood. The improved performance of FBN is illustrated on a large SKU dataset containing images of retail store product displays.
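The abstract's key observation is that when a node is a linear function of the inputs, its dataset-wide mean and variance follow in closed form from the input statistics. A minimal sketch of this idea, assuming a plain linear layer (all names and shapes here are illustrative, not taken from the paper):

```python
import numpy as np

# Illustrative sketch of Full Batch Normalization (FBN) for a node that is a
# linear function of the input features: z = W x + b.
# Instead of per-batch statistics, the node's mean and variance are derived
# analytically from statistics of the ENTIRE training set:
#   E[z]   = W mu_x + b
#   Var[z] = diag(W cov_x W^T)

rng = np.random.default_rng(0)

# Toy "entire training set": N samples, d input features.
X = rng.normal(size=(10_000, 4))

# Dataset-level input statistics (computed once, over all training data).
mu_x = X.mean(axis=0)             # shape (d,)
cov_x = np.cov(X, rowvar=False)   # shape (d, d)

# A linear layer with k output nodes.
W = rng.normal(size=(3, 4))       # shape (k, d)
b = rng.normal(size=3)            # shape (k,)

# Closed-form FBN statistics of z = W x + b over the whole dataset.
mu_z = W @ mu_x + b
var_z = np.einsum('kd,de,ke->k', W, cov_x, W)  # diag(W cov_x W^T)

def fbn(x, eps=1e-5):
    """Normalize the linear layer's output with full-dataset statistics."""
    z = x @ W.T + b
    return (z - mu_z) / np.sqrt(var_z + eps)

Z = fbn(X)
# Over the whole training set, the normalized outputs have (approximately)
# zero mean and unit variance, without ever touching a mini-batch estimate.
```

Because the statistics come from the full dataset rather than the current batch, the normalization is deterministic at training time, which is what makes it compatible with autoregressive or structured likelihoods such as a linear-chain CRF.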

Original language: English
Pages (from-to): 2780-2784
Number of pages: 5
Journal: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
State: Published - 2021
Event: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration: 6 Jun 2021 - 11 Jun 2021

Bibliographical note

Funding Information:
Thanks to Trax Retail for funding this research and providing the data.

Publisher Copyright:
© 2021 IEEE


Keywords:

  • Batch normalization
  • CRF
  • FBN


