Resource-Efficient Decentralized and Federated Learning

Distributed optimization is a classic topic that has recently attracted significant interest in machine learning due to applications such as distributed training, multi-agent learning, and federated optimization. The scale of modern datasets often exceeds the capacity of a single machine, while privacy and communication constraints prevent information sharing in a centralized manner and necessitate distributed infrastructures. Broadly speaking, there are two types of distributed settings: a distributed/federated setting, where a parameter server aggregates and shares parameters across all agents; and a decentralized/network setting, where each agent aggregates and shares parameters only with its neighbors over a network topology. The canonical problem of empirical risk minimization in the distributed setting leads to intriguing trade-offs between computation and communication that are not well understood; moreover, data imbalance and heterogeneity across agents pose additional challenges for both algorithmic convergence and statistical efficiency, often exacerbated by bandwidth and privacy constraints.
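
To make the distinction between the two settings concrete, the following minimal sketch (illustrative only, not code from this group) contrasts the two aggregation patterns: in the federated setting a parameter server averages all agents' parameters, whereas in the decentralized setting each agent mixes only with its neighbors via a doubly stochastic mixing matrix defined by the network topology. The function names `server_aggregate` and `gossip_aggregate` and the example mixing matrix `W` are hypothetical.

```python
import numpy as np

def server_aggregate(local_params, weights=None):
    """Federated setting: a parameter server averages all agents' parameters.

    local_params: list of np.ndarray, one parameter vector per agent.
    weights: optional per-agent weights (e.g., proportional to local data size).
    """
    weights = np.ones(len(local_params)) if weights is None else np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * p for w, p in zip(weights, local_params))

def gossip_aggregate(local_params, mixing_matrix):
    """Decentralized setting: each agent mixes only with its neighbors.

    mixing_matrix: doubly stochastic matrix W, where W[i, j] > 0 only if
    agents i and j are neighbors in the network topology.
    Returns one post-mixing parameter vector per agent.
    """
    n = len(local_params)
    return [sum(mixing_matrix[i, j] * local_params[j] for j in range(n))
            for i in range(n)]

# Toy usage: 3 agents on a fully connected topology with uniform mixing weights.
params = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, 2.0])]
W = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
print(server_aggregate(params))      # one global average shared with all agents
print(gossip_aggregate(params, W))   # per-agent averages over neighbors only
```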

Overview

Communication-Privacy Trade-offs

Communication-Efficient Federated and Decentralized Optimization

Vertical Federated Learning

Federated Reinforcement Learning