TY - UNPB
T1 - Online linear quadratic control
AU - Cohen, A.
AU - Hassidim, A.
AU - Koren, T.
AU - Lazic, N.
AU - Mansour, Y.
AU - Talwar, K.
PY - 2018/6/19
Y1 - 2018/6/19
N2 - We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee O(√T) regret under mild assumptions, where T is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to "strongly stable" policies that mix exponentially fast to a steady state.
AB - We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee O(√T) regret under mild assumptions, where T is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to "strongly stable" policies that mix exponentially fast to a steady state.
UR - http://scholar.google.com/scholar?num=3&hl=en&lr=&q=allintitle%3A%20Online%20linear%20quadratic%20control%2C%20author%3ACohen%20OR%20author%3AHassidim%20OR%20author%3AKoren%20OR%20author%3ALazic%20OR%20author%3AMansour%20OR%20author%3ATalwar&as_ylo=2018&as_yhi=&btnG=Search&as_vis=0
U2 - 10.48550/arXiv.1806.07104
DO - 10.48550/arXiv.1806.07104
M3 - Preprint
VL - 7104
T3 - arXiv preprint arXiv:1806.07104
BT - Online linear quadratic control
ER -