New network topologies, such as Dragonfly, have been designed for supercomputers to reduce both the latency and the cost of the network. Yet the communication algorithms used by distributed applications are generic and do not take the network's properties into account. With topology-aware algorithms, we expect to improve the performance of these communication algorithms, making distributed applications faster.
In this seminar, we study various communication algorithms for Scatter and AllGather using CODES, an event-driven network simulator. The results include both expected and counterintuitive findings, and demonstrate that the topology and the network hardware (routers) must be taken into account to design efficient communication algorithms.
Bio:
Nathanaël Cheriere is a pre-doc student (between Master and PhD) at Ecole Normale Supérieure de Rennes (France), where he also earned his master's degree. He has been working as an intern with Rob Ross and Matthieu Dorier since January. Before his time at ANL, he worked on many aspects of distributed computing: scheduling algorithms at Inria Grenoble, theoretical scheduling at the University of North Carolina Charlotte, MapReduce at Inria Rennes, and data storage at Microsoft Research Cambridge. He will start a PhD under the supervision of Gabriel Antoniu and Shadi Ibrahim in September.