Abstract
Robustness against adversarial attacks has recently been at the forefront of algorithmic design for machine learning tasks. In the adversarial streaming model, an adversary gives an algorithm a sequence of adaptively chosen updates u1,..., un as a data stream. The goal of the algorithm is to compute or approximate some predetermined function for every prefix of the adversarial stream, but the adversary may generate future updates based on previous outputs of the algorithm. In particular, the adversary may gradually learn the random bits internally used by an algorithm to manipulate dependencies in the input. This is especially problematic as many important problems in the streaming model require randomized algorithms, as they are known to not admit any deterministic algorithms that use sublinear space. In this paper, we introduce adversarially robust streaming algorithms for central machine learning and algorithmic tasks, such as regression and clustering, as well as their more general counterparts, subspace embedding, low-rank approximation, and coreset construction. For regression and other numerical linear algebra related tasks, we consider the row arrival streaming model. Our results are based on a simple, but powerful, observation that many importance sampling-based algorithms give rise to adversarial robustness which is in contrast to sketching based algorithms, which are very prevalent in the streaming literature but suffer from adversarial attacks. In addition, we show that the well-known merge and reduce paradigm in streaming is adversarially robust. Since the merge and reduce paradigm allows coreset constructions in the streaming setting, we thus obtain robust algorithms for k-means, k-median, k-center, Bregman clustering, projective clustering, principal component analysis (PCA) and non-negative matrix factorization. To the best of our knowledge, these are the first adversarially robust results for these problems yet require no new algorithmic implementations. Finally, we empirically confirm the robustness of our algorithms on various adversarial attacks and demonstrate that by contrast, some common existing algorithms are not robust.
Original language | English |
---|---|
Title of host publication | Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021 |
Editors | Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan |
Publisher | Neural information processing systems foundation |
Pages | 3544-3557 |
Number of pages | 14 |
ISBN (Electronic) | 9781713845393 |
State | Published - 2021 |
Event | 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online Duration: 6 Dec 2021 → 14 Dec 2021 |
Publication series
Name | Advances in Neural Information Processing Systems |
---|---|
Volume | 5 |
ISSN (Print) | 1049-5258 |
Conference
Conference | 35th Conference on Neural Information Processing Systems, NeurIPS 2021 |
---|---|
City | Virtual, Online |
Period | 6/12/21 → 14/12/21 |
Bibliographical note
Publisher Copyright:© 2021 Neural information processing systems foundation. All rights reserved.
Funding
Sandeep Silwal was supported in part by a NSF Graduate Research Fellowship Program. Samson Zhou was supported by a Simons Investigator Award of David P. Woodruff.
Funders | Funder number |
---|---|
National Science Foundation |