Exascale computing architectures are unlikely to preserve the ratio of network injection bandwidth (the rate at which a node injects data into the network fabric) to computational FLOPs seen in today's leadership-class systems, because doing so is challenging in terms of cost, power, and physical size. This study evaluates the overall performance impact of decreasing network injection bandwidth relative to increasing computational FLOPs. Several approaches were taken, with the ultimate goal of projecting the impact of network injection bandwidth on exascale time-to-solution for the most heavily used scientific simulation codes at the Argonne Leadership Computing Facility (ALCF). This talk will describe these approaches and the results that were obtained.
Bio:
Paul Coffman is a Principal Scientific Applications Engineering Specialist in the Performance Engineering Group at the Argonne Leadership Computing Facility.