From Federated to Fog Learning: Expanding the Frontier of Model Training over Contemporary Wireless Network Systems

Room: 1302, Bldg: SCDI, 500 El Camino Real, Santa Clara University, Santa Clara, California, United States, 95053

Fog learning is an emerging paradigm for optimizing the orchestration of artificial intelligence services over contemporary network systems. Different from existing distributed techniques such as federated learning, fog learning emphasizes intrinsically in its design the unique node, network, and data properties encountered in today’s fog networks that span computing elements from the edge to the cloud. An important thread of research in fog learning has been on understanding the role that local topologies formed on an ad-hoc basis among proximal groups of heterogeneous computing elements can play in elevating the achievable tradeoff between intelligence quality and resource efficiency. In this talk, I will discuss recent results on the analysis of fog learning processes which give insights into the impact that these topologies, along with other properties such as model characteristics and fog decision parameters, have on global training performance. Additionally, I will discuss the development of adaptive control methodologies that leverage such relationships for jointly optimizing relevant fog learning metrics.

Speaker(s): Dr. Christopher G. Brinton

***CANCELED***

Lightwave Fabrics: At-Scale Optical Circuit Switching for Datacenter and Machine Learning Systems

Northeastern University Welcome Center Space, 75 E Santa Clara Street, San Jose, California, United States, 95113, Virtual: https://events.vtools.ieee.org/m/450615

Abstract: We describe our experience developing what we believe to be the world’s first large-scale production deployments of lightwave fabrics used for both datacenter networking and machine-learning applications. Optical circuit switches and optical transceivers developed in-house have produced a lightwave fabric that is reconfigurable, low latency, rate agnostic, and highly available. These fabrics have provided substantial benefits for long-lived traffic patterns in tightly-coupled machine learning clusters. We also report results for a large-scale ML superpod with 4096 tensor processing unit chips that has more than one exaflop of computing power.

Speaker(s): Dr. Kevin Yasumura

Agenda:
6:30 – 7:00 PM Registration & Networking
7:00 – 7:45 PM Invited Talk
7:45 – 8:00 PM Questions & Answers