Is the hierarchical all-to-all communication primitive released in the latest version? #5023
Unanswered · roychen9462 asked this question in Q&A · Replies: 0 comments
I am trying to understand the details of how DeepSpeed speeds up the MoE inference process. In Section 5.3 of the DeepSpeed-MoE paper, two communication optimizations are described for grouping and routing tokens more efficiently: hierarchical all-to-all and parallelism-coordinated communication optimization. Are these optimizations implemented in the latest version of DeepSpeed (v0.13.1)? Can anyone give some advice on this? Thanks!
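
For context, my rough mental model of the hierarchical all-to-all from the paper is a two-phase exchange like the sketch below, written with plain PyTorch process groups. The function and group names (`hierarchical_all_to_all`, `intra_node_group`, `inter_node_group`) are my own placeholders, not DeepSpeed APIs, and the local token reshuffle between the two phases is omitted:

```python
import torch
import torch.distributed as dist


def hierarchical_all_to_all(tokens: torch.Tensor,
                            intra_node_group: dist.ProcessGroup,
                            inter_node_group: dist.ProcessGroup) -> torch.Tensor:
    """Two-phase token exchange instead of one global all-to-all.

    Phase 1 exchanges tokens among the GPUs of a single node (fast NVLink/PCIe);
    phase 2 exchanges the regrouped tokens across nodes, so each rank sends
    fewer but larger inter-node messages.
    """
    # Phase 1: intra-node all-to-all (assumes tokens split evenly across ranks).
    intra_out = torch.empty_like(tokens)
    dist.all_to_all_single(intra_out, tokens, group=intra_node_group)

    # Phase 2: inter-node all-to-all of the already-grouped tokens.
    inter_out = torch.empty_like(intra_out)
    dist.all_to_all_single(inter_out, intra_out, group=inter_node_group)
    return inter_out
```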