Seminar #5 - Colin Unger - Unity (OSDI'22)

Start: 2022-09-08T10:30:00Z
End: 2022-09-08T11:30:00Z
Location: Huxley 315 - Imperial College London

Seminar

Abstract

This paper presents Unity, the first system that jointly optimizes algebraic transformations and parallelization in distributed DNN training. Unity represents both parallelization and algebraic transformations as substitutions on a unified parallel computation graph (PCG), which simultaneously expresses the computation, parallelization, and communication of a distributed DNN training procedure.

Optimizations, in the form of graph substitutions, are automatically generated given a list of operator specifications, and are formally verified correct using an automated theorem prover. Unity then uses a novel hierarchical search algorithm to jointly optimize algebraic transformations and parallelization while maintaining scalability. The combination of these techniques provides a generic and extensible approach to optimizing distributed DNN training, capable of integrating new DNN operators, parallelization strategies, and model architectures with minimal manual effort.

We evaluate Unity on seven real-world DNNs running on up to 192 GPUs on 32 nodes and show that Unity outperforms existing DNN training frameworks by up to 3.6× while keeping optimization times under 20 minutes. Unity is available to use as part of the open-source DNN training framework FlexFlow at https://github.com/flexflow/flexflow.

Date

Sep 8, 2022 10:30 AM — 11:30 AM

Location

Huxley 315 - Imperial College London

Speaker Bio

Colin Unger is a second-year PhD student at Stanford, advised by Alex Aiken. He received his bachelor’s degree from UC Santa Barbara, where he worked with Giovanni Vigna and Christopher Kruegel on binary analysis. He is broadly interested in compilers, program analysis, and optimization, especially in emerging applications, and is currently focused on hardware-aware optimization of deep learning workloads.

Compilers, DL Training, Pipeline Parallelism, Tensor Parallelism, Data Parallelism, Algebraic Optimization, Verification, Strategy Search