Supervised Domain Adaptation by transferring both the parameter set and its gradient

Shaya Goodman, Hayit Greenspan, Jacob Goldberger

Research output: Contribution to journal › Article › peer-review


Abstract

A well-known obstacle to the successful application of deep learning-based systems to real-world problems is the performance degradation that occurs when a network trained on data collected in one domain is applied to data from a different domain. In this study, we focus on the Supervised Domain Adaptation (SDA) setup, in which a small amount of labeled data from the target domain is assumed to be available. Our approach transfers the gradient history of the pre-training phase to the fine-tuning phase, in addition to the parameter set, in order to preserve the generalization achieved during pre-training while fine-tuning the model. We present two schemes for transferring the gradient information: Mixed Minibatch Transfer Learning (MMTL) constructs minibatches containing examples from both the source and target domains, and Optimizer-Continuation Transfer Learning (OCTL) preserves the gradient history when shifting from training to fine-tuning. The approach also applies to the more general setup of transfer learning across different tasks. We show that our methods outperform the state-of-the-art at different levels of target-domain data scarcity, on multiple datasets and tasks involving both scenery and medical images.
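The two schemes can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation; the helper names (`mixed_minibatches`, `MomentumSGD`) and the 50/50 mixing ratio are illustrative assumptions. For MMTL, each minibatch mixes source- and target-domain examples; for OCTL, the point is that the optimizer's accumulated gradient history (here, a momentum buffer) is carried over into fine-tuning rather than re-initialized.

```python
import random

def mixed_minibatches(source, target, batch_size, target_frac=0.5, seed=0):
    """MMTL-style sketch (hypothetical helper): draw each minibatch from
    both domains, so fine-tuning gradients keep reflecting source
    statistics. target_frac is an assumed mixing ratio."""
    rng = random.Random(seed)
    n_tgt = max(1, int(batch_size * target_frac))
    n_src = batch_size - n_tgt
    batch = rng.sample(target, n_tgt) + rng.sample(source, n_src)
    rng.shuffle(batch)
    return batch

class MomentumSGD:
    """Minimal SGD with momentum; the velocity buffer `v` is the
    'gradient history' that an OCTL-style scheme would carry over
    from pre-training into fine-tuning."""
    def __init__(self, lr=0.1, beta=0.9):
        self.lr, self.beta, self.v = lr, beta, 0.0

    def step(self, w, grad):
        self.v = self.beta * self.v + grad
        return w - self.lr * self.v

# OCTL sketch: reuse the SAME optimizer object (with its velocity)
# when switching from the source phase to the target phase, instead
# of constructing a fresh optimizer as standard fine-tuning would.
opt = MomentumSGD()
w = 1.0
w = opt.step(w, grad=0.5)   # pre-training step on source data
w = opt.step(w, grad=0.5)   # fine-tuning step: opt.v is retained
```

In a real framework the same idea amounts to saving and restoring the optimizer state (e.g. momentum or Adam moment estimates) alongside the model parameters when moving from pre-training to fine-tuning.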

Original language: English
Article number: 126828
Journal: Neurocomputing
Volume: 560
DOIs
State: Published - 1 Dec 2023

Bibliographical note

Publisher Copyright:
© 2023 Elsevier B.V.

Funding

This study was partially supported by the Israel Ministry of Science and Technology.

Funder: Ministry of Science and Technology, Israel

Keywords

• Catastrophic forgetting
• Gradient transfer
• MRI segmentation
• Medical site adaptation
• Supervised domain adaptation
• Transfer learning

