Publications
(* denotes equal contribution)
Revisiting the Time Cost Model of AllReduce
Dian Xiong, Li Chen, Youhe Jiang, Dan Li, Shuai Wang, Songtao Wang
arXiv preprint
| paper | code |
FlashFlex: Accommodating Large Language Model Training over Heterogeneous Environment
Ran Yan*, Youhe Jiang*, Wangcheng Tao, Xiaonan Nie, Bin Cui, Binhang Yuan
arXiv preprint
| paper | code |
HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment
Youhe Jiang*, Ran Yan*, Xiaozhe Yao*, Yang Zhou, Beidi Chen, Binhang Yuan
ICML 2024
| paper | code |
Improving Automatic Parallel Training via Balanced Memory Workload Optimization
Yujie Wang, Youhe Jiang, Xupeng Miao, Fangcheng Fu, Shenhan Zhu, Xiaonan Nie, Yaofeng Tu, Bin Cui
TKDE 2024
| paper | code |
OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning
Youhe Jiang, Fangcheng Fu, Xupeng Miao, Xiaonan Nie, Bin Cui
IJCAI 2023
| paper | code |
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism
Xupeng Miao*, Yujie Wang*, Youhe Jiang*, Chunan Shi, Xiaonan Nie, Hailin Zhang, Bin Cui
VLDB 2023
| paper | code |
OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning
Youhe Jiang, Xupeng Miao, Xiaonan Nie, Bin Cui
ICML 2023 Workshop
| paper | code |
2D-HRA: Two-Dimensional Hierarchical Ring-Based All-Reduce Algorithm in Large-Scale Distributed Machine Learning
Youhe Jiang, Huaxi Gu, Yunfeng Lu, Xiaoshan Yu
IEEE Access 2020
| paper | code |