Robust Multi-agent Multi-armed Bandits: From Corruption-Resilience to Byzantine-Resilience
Time: Friday, April 17, 2026, 15:00
Venue: Room West 122, Nanyong Building (南雍楼), Suzhou Campus
Chen Cheng (陈程), Associate Professor
Software Engineering Institute, East China Normal University
Abstract
Cooperative multi-agent multi-armed bandits (CMA2B) investigate how multiple agents collaborate to minimize the regret of a multi-armed bandit problem. Although this setting has been extensively studied, most existing algorithms remain vulnerable to various forms of adversarial manipulation. In this talk, we first introduce a CMA2B framework that is robust to adversarial corruption, where an adversary can corrupt the reward observations of all agents under a limited corruption budget. We then consider a more realistic scenario in which the adversary can attack only a small subset of agents. We show that in this case, the impact of adversarial attacks can be almost completely eliminated, and that the framework is inherently robust in the Byzantine setting, where an unknown fraction of agents may arbitrarily select arms and spread incorrect information.