Decentralized Learning for Channel Allocation in IoT Networks Over Unlicensed Bandwidth as a Contextual Multi-Player Multi-Armed Bandit Game

Wenbo Wang, Amir Leshem, Dusit Niyato, Zhu Han

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

We study a decentralized channel allocation problem in an ad-hoc Internet of Things (IoT) network underlaying the spectrum licensed to a primary cellular network. In the considered network, the limited channel sensing/probing capability and computational resources of the IoT devices make it difficult for them to acquire detailed Channel State Information (CSI) for the multiple shared channels. In practice, the unknown patterns of the primary users' transmission activities and the time-varying CSI (e.g., due to small-scale fading or device mobility) also cause stochastic changes in channel quality. Decentralized IoT links are thus expected to learn channel conditions online from partial observations, acquiring no information about the channels on which they are not operating. They also have to reach an efficient, collision-free channel allocation with limited coordination. Our study maps this problem to a contextual multi-player, multi-armed bandit game and proposes a purely decentralized, three-stage policy learning algorithm based on trial and error. Theoretical analysis shows that the proposed scheme guarantees that the IoT links jointly converge to the socially optimal channel allocation with sub-linear (i.e., polylogarithmic) regret with respect to the operational time. Simulations demonstrate that it strikes a good balance between efficiency and network scalability compared with other state-of-the-art decentralized bandit algorithms.
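The problem setting in the abstract can be illustrated with a small toy model. The sketch below is not the authors' three-stage algorithm and omits the contextual part entirely; it only assumes a hypothetical matrix `mu` of mean rewards (one row per IoT link, one column per channel), zero reward for any links that collide on a channel, and at least as many channels as links. Under those assumptions it computes the socially optimal collision-free allocation by brute force and the cumulative expected regret of a given sequence of joint channel choices.

```python
import itertools
from collections import Counter


def optimal_allocation(mu):
    """Brute-force the collision-free assignment with the largest total mean reward.

    mu[i][k] is the (assumed) mean reward of link i on channel k.
    Requires len(mu) <= len(mu[0]); only practical for small toy instances.
    """
    n_links, n_channels = len(mu), len(mu[0])
    best_value, best_assignment = float("-inf"), None
    for assignment in itertools.permutations(range(n_channels), n_links):
        value = sum(mu[i][ch] for i, ch in enumerate(assignment))
        if value > best_value:
            best_value, best_assignment = value, assignment
    return best_assignment, best_value


def expected_welfare(mu, choices):
    """Sum of mean rewards for one round; links sharing a channel collide and earn 0."""
    counts = Counter(choices)
    return sum(mu[i][ch] for i, ch in enumerate(choices) if counts[ch] == 1)


def cumulative_regret(mu, history):
    """Expected welfare lost relative to always playing the optimal allocation."""
    _, best_value = optimal_allocation(mu)
    return sum(best_value - expected_welfare(mu, choices) for choices in history)


if __name__ == "__main__":
    # Hypothetical instance: 2 links, 3 channels.
    mu = [[0.9, 0.2, 0.5],
          [0.8, 0.7, 0.1]]
    print(optimal_allocation(mu))
    # Round 1: both links collide on channel 0; round 2: optimal; round 3: suboptimal.
    print(cumulative_regret(mu, [(0, 0), (0, 1), (2, 0)]))
```

A decentralized learning policy would have to produce the `history` of joint choices without any link seeing `mu` or the other links' rewards; a sub-linear regret guarantee means that the value returned by `cumulative_regret` grows more slowly than the number of rounds.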

Original language: English
Pages (from-to): 3162-3178
Number of pages: 17
Journal: IEEE Transactions on Wireless Communications
Volume: 21
Issue number: 5
State: Published - 1 May 2022

Bibliographical note

Publisher Copyright:
© 2002-2012 IEEE.

Keywords

  • Contextual multi-player multi-armed bandits
  • ad-hoc IoTs
  • decentralized learning
  • sub-linear regret
