Abstract
How can non-communicating agents learn to share congested resources efficiently? This is a challenging task when the agents can access the same resource simultaneously (in contrast to multi-agent multi-armed bandit problems) and the resource valuations differ among agents. We present a fully distributed algorithm for learning to share in congested environments and prove that the agents’ regret with respect to the optimal allocation is poly-logarithmic in the time horizon. Performance in the non-asymptotic regime is illustrated in numerical simulations. The distributed algorithm has applications in cloud computing and spectrum sharing.
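The abstract does not detail the algorithm, but the setting it describes can be illustrated with a toy sketch: two non-communicating agents repeatedly choose between two shared resources, each running an independent epsilon-greedy rule, and we track cumulative regret against the optimal allocation. Everything here is an illustrative assumption (the valuations, the halving of reward under congestion, and epsilon-greedy itself); this is not the paper's algorithm, whose regret guarantee is poly-logarithmic rather than the linear-in-epsilon regret of this sketch.

```python
import random

# Toy congestion model (illustrative assumption, not the paper's setting):
# two agents, two resources; an agent's reward for a resource is its private
# valuation, halved when both agents pick the same resource (congestion).
VALS = [[1.0, 0.6],   # agent 0's valuations of resources 0 and 1
        [0.5, 0.9]]   # agent 1's valuations

def reward(agent, choice, other_choice):
    v = VALS[agent][choice]
    return v / 2 if choice == other_choice else v

# Optimal allocation: agent 0 -> resource 0, agent 1 -> resource 1.
OPT_WELFARE = VALS[0][0] + VALS[1][1]

def run(horizon=5000, eps=0.05, seed=0):
    rng = random.Random(seed)
    # Each agent keeps its own empirical means -- no communication.
    means = [[0.0, 0.0], [0.0, 0.0]]
    counts = [[0, 0], [0, 0]]
    regret = 0.0
    for _ in range(horizon):
        choices = []
        for a in range(2):
            if rng.random() < eps:                      # explore
                choices.append(rng.randrange(2))
            else:                                       # exploit
                choices.append(max(range(2), key=lambda r: means[a][r]))
        welfare = 0.0
        for a in range(2):
            r = reward(a, choices[a], choices[1 - a])
            welfare += r
            counts[a][choices[a]] += 1
            # Incremental update of the empirical mean reward.
            means[a][choices[a]] += (r - means[a][choices[a]]) / counts[a][choices[a]]
        regret += OPT_WELFARE - welfare
    return regret

total_regret = run()
```

In this toy run the agents quickly settle on the non-congested allocation, so cumulative regret grows far more slowly than the worst case; the residual linear term comes from the constant exploration rate, which is exactly the kind of cost the paper's algorithm avoids.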
| Original language | English |
|---|---|
| Article number | 111817 |
| Journal | Automatica |
| Volume | 169 |
| DOIs | |
| State | Published - Nov 2024 |
Bibliographical note
Publisher Copyright: © 2024 Elsevier Ltd
Keywords
- Congestion games
- Distributed learning
- Learning in dense environments
- Learning in games
- Poly-logarithmic regret