Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning

Idan Achituve, Idit Diamant, Arnon Netzer, Gal Chechik, Ethan Fetaya

Research output: Contribution to journal › Conference article › peer-review

Abstract

As machine learning becomes more prominent, there is a growing demand to perform several inference tasks in parallel. Multi-task learning (MTL) addresses this challenge by learning a single model that solves several tasks simultaneously and efficiently. Optimizing MTL models often entails first computing the gradient of the loss for each task and then aggregating all the gradients to obtain a combined update direction. However, common methods following this approach do not consider an important aspect: the sensitivity of the individual gradient dimensions. Some dimensions may tolerate changes well, while others may be more restrictive. Here, we introduce a novel gradient aggregation procedure based on Bayesian inference. We place a probability distribution over the task-specific parameters, which in turn induces a distribution over the gradients of the tasks. This information allows us to quantify the uncertainty associated with each gradient dimension, which is then factored in when aggregating the gradients. We empirically demonstrate the benefits of our approach on a variety of datasets, achieving state-of-the-art performance.
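To make the idea concrete, below is a minimal NumPy sketch of uncertainty-aware gradient aggregation. It is an illustration under stated assumptions, not the paper's exact procedure: it assumes each task's gradient uncertainty is summarized by per-dimension sample means and variances (e.g., obtained from samples induced by a distribution over task-specific parameters), and it combines tasks with a simple precision-weighted rule. The function name `aggregate_gradients` and the weighting scheme are illustrative choices.

```python
import numpy as np

def aggregate_gradients(grad_samples_per_task):
    """Illustrative uncertainty-aware gradient aggregation (not the paper's exact method).

    grad_samples_per_task: list of arrays, each of shape (S, D), holding S
    sampled gradients of one task's loss w.r.t. D shared parameters.

    Returns a combined update direction of shape (D,).
    """
    means, precisions = [], []
    for samples in grad_samples_per_task:
        mu = samples.mean(axis=0)            # per-dimension gradient mean
        var = samples.var(axis=0) + 1e-8     # per-dimension uncertainty
        means.append(mu)
        precisions.append(1.0 / var)         # confident dimensions get more weight
    means = np.stack(means)          # shape (T, D)
    precisions = np.stack(precisions)  # shape (T, D)
    # Precision-weighted combination: dimensions where a task is uncertain
    # contribute less of that task's gradient to the shared update.
    return (precisions * means).sum(axis=0) / precisions.sum(axis=0)

# Toy usage: two tasks, 3 shared parameters, 5 gradient samples each.
rng = np.random.default_rng(0)
g1 = rng.normal(loc=[1.0, -0.5, 0.2], scale=[0.1, 1.0, 0.1], size=(5, 3))
g2 = rng.normal(loc=[0.8, 0.4, -0.3], scale=[0.1, 0.1, 1.0], size=(5, 3))
print(aggregate_gradients([g1, g2]))
```

In this toy example each task is confident about different dimensions, so the combined direction follows the confident task in each coordinate rather than naively averaging the two gradients.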

Original language: English
Pages (from-to): 117-134
Number of pages: 18
Journal: Proceedings of Machine Learning Research
Volume: 235
State: Published - 2024
Event: 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria
Duration: 21 Jul 2024 - 27 Jul 2024

Bibliographical note

Publisher Copyright:
Copyright 2024 by the author(s)
