Deep Multi-User Reinforcement Learning for Distributed Dynamic Spectrum Access

Oshri Naparstek, Kobi Cohen

Research output: Contribution to journal › Article › peer-review

328 Scopus citations

Abstract

We consider the problem of dynamic spectrum access for network utility maximization in multichannel wireless networks. The shared bandwidth is divided into $K$ orthogonal channels. At the beginning of each time slot, each user selects a channel and transmits a packet with a certain transmission probability. After each time slot, each user that has transmitted a packet receives a local observation indicating whether its packet was successfully delivered or not (i.e., an ACK signal). The objective is to find a multi-user strategy for accessing the spectrum that maximizes a certain network utility in a distributed manner, without online coordination or message exchanges between users. Obtaining an optimal solution for the spectrum access problem is computationally expensive, in general, due to the large state space and the partial observability of the states. To tackle this problem, we develop a novel distributed dynamic spectrum access algorithm based on deep multi-user reinforcement learning. Specifically, at each time slot, each user maps its current state to the spectrum access actions based on a trained deep Q-network used to maximize the objective function. A game-theoretic analysis of the system dynamics is developed to establish design principles for the implementation of the algorithm. The experimental results demonstrate the strong performance of the algorithm.
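The slotted access model described in the abstract can be illustrated with a minimal sketch of a single time slot. This is not the authors' implementation and it contains no learning component. It assumes a simple collision model in which a packet is delivered only when exactly one user transmits on its channel; the abstract does not state the collision rule, so that part is an assumption. The function name `simulate_slot` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def simulate_slot(channel_choices, tx_probs, num_channels, rng):
    """Simulate one time slot of multichannel random access.

    channel_choices[i]: channel index in [0, num_channels) chosen by user i.
    tx_probs[i]: probability that user i transmits in this slot.
    rng: a random.Random instance, passed in for reproducibility.

    Returns one local observation per user:
      None  -> the user did not transmit, so it observes nothing
      True  -> the user transmitted and receives an ACK
      False -> the user transmitted and receives no ACK
    """
    # Each user independently decides whether to transmit.
    transmitted = [rng.random() < p for p in tx_probs]

    # Count the transmitters on each channel.
    load = [0] * num_channels
    for channel, tx in zip(channel_choices, transmitted):
        if tx:
            load[channel] += 1

    # ASSUMPTION: a packet succeeds only if its sender is alone on the channel.
    observations = []
    for channel, tx in zip(channel_choices, transmitted):
        observations.append((load[channel] == 1) if tx else None)
    return observations


if __name__ == "__main__":
    rng = random.Random(0)
    # Three users, K = 2 channels; users 0 and 1 pick the same channel.
    print(simulate_slot([0, 0, 1], [1.0, 1.0, 1.0], num_channels=2, rng=rng))
```

Each user sees only its own entry of the returned list, which reflects the partial observability mentioned in the abstract. In the paper's setting, the channel choice and transmission decision would come from each user's trained deep Q-network rather than from fixed inputs as in this sketch.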

Original language: English
Article number: 8532121
Pages (from-to): 310-323
Number of pages: 14
Journal: IEEE Transactions on Wireless Communications
Volume: 18
Issue number: 1
State: Published - Jan 2019
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2002-2012 IEEE.

Keywords

  • Wireless networks
  • deep reinforcement learning
  • dynamic spectrum access
  • medium access control (MAC) protocols
  • multi-agent learning
