DeepSpeed v0.11.0
New features
- DeepSpeed-VisualChat: Improve Your Chat Experience with Multi-Round Multi-Image Inputs [English] [中文] [日本語]
- Announcing the DeepSpeed4Science Initiative: Enabling large-scale scientific discovery through sophisticated AI system technologies [DeepSpeed4Science website] [Tutorials] [Blog] [中文] [日本語]
What's Changed
- added a model check for use_triton in deepspeed by @stephen-youn in #4266
- Update release and bump patch versioning flow by @loadams in #4286
- README update by @tjruwase in #4303
- Update README.md by @NinoRisteski in #4316
- Handle empty parameter groups by @tjruwase in #4277
- Clean up modeling code by @loadams in #4320
- Fix Zero3 contiguous grads, reduce scatter false accuracy issue by @nelyahu in #4321
- Add release version checking by @loadams in #4328
- clear redundant timers by @starkhu in #4308
- DS-Chat BLOOM: Fix Attention mask by @lekurile in #4338
- Fix a bug in the implementation of dequantization for inference by @sakogan in #3433
- Suppress noise by @tjruwase in #4310
- Fix skipped inference tests by @mrwyattii in #4336
- Fix autotune to support Triton 2.1 by @stephen-youn in #4340
- Pass base_dir so that model files can be loaded for auto-tp/meta-tensor by @awan-10 in #4348
- Support InternLM by @wangruohui in #4137
- DeepSpeed4Science by @conglongli in #4357
- fix deepspeed4science links by @conglongli in #4358
- Add the policy to run llama model from the official repo by @RezaYazdaniAminabadi in #4313
- Check inference input_id tokens length by @mrwyattii in #4349
- add deepspeed4science blog link by @conglongli in #4364
- Update conda env to have max pydantic version by @loadams in #4362
- Enable workflow dispatch on Torch 1.10 CI tests by @loadams in #4361
- deepspeed4science chinese blog by @conglongli in #4366
- deepspeed4science japanese blog by @conglongli in #4369
- Openfold fix by @cctry in #4368
- [BUG] add the missing method to MPS accelerator by @cli99 in #4363
- Fix multinode runner to properly append to PDSH_SSH_ARGS_APPEND by @loadams in #4373
- Fix min torch version by @tjruwase in #4375
- Fix llama meta tensor loading in AutoTP and kernel injected inference by @zeyugao in #3608
- adds triton flash attention2 kernel by @stephen-youn in #4337
- Allow multiple inference engines in single script by @mrwyattii in #4384
- Save/restore step in param groups with zero 1 or 2 by @tohtana in #4396
- Fix incorrect assignment of self.quantized_nontrainable_weights by @VeryLazyBoy in #4399
- update deepspeed4science blog by @conglongli in #4408
- Add torch no grad condition by @ajindal1 in #4391
- Update nv-transformers workflow to use cu11.6 by @loadams in #4412
- Add condition when dimension is greater than 2 by @ajindal1 in #4390
- [CPU] Add CPU AutoTP UT. by @Yejing-Lai in #4263
- fix cpu loading model partition OOM by @Yejing-Lai in #4353
- Update cpu_inference checkout action by @loadams in #4424
- Zero infinity xpu support by @Liangliang-Ma in #4130
- [CCLBackend] Using parallel memcpy for inference_all_reduce by @delock in #4404
- Change default `set_to_none=True` in `zero_grad` methods by @Jackmin801 in #4438
- Small docstring fix by @Jackmin801 in #4431
- fix: check-license by @Jackmin801 in #4432
- Fixup check release version script by @loadams in #4413
- Enable ad-hoc running of cpu_inference by @loadams in #4444
- Fix wrong documentation of `ignore_unused_parameters` by @UniverseFly in #4418
- DeepSpeed-VisualChat Blog by @xiaoxiawu-microsoft in #4446
- Fix a bug in DeepSpeedMLP by @sakogan in #4389
- documenting load_from_fp32_weights config parameter by @clumsy in #4449
- Add Japanese translation of DS-VisualChat blog by @tohtana in #4454
- fix blog format by @conglongli in #4456
- Update README-Japanese.md by @conglongli in #4457
- DeepSpeed-VisualChat Chinese blog by @conglongli in #4458
- CI fix for torch 2.1 release by @mrwyattii in #4452
- fix lm head overridden issue, move it from checkpoint in-loop loading … by @sywangyi in #4206
- feat: add Lion by @enneamer in #4331
- pipe engine eval_batch: add option to disable loss broadcast by @nelyahu in #4326
- Add release flow by @loadams in #4467
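One behavioral note for users upgrading: #4438 changes the `zero_grad` default to `set_to_none=True`, so after clearing, gradients are dropped (`.grad is None`) rather than kept as zero-filled buffers, and code that reads `.grad` unconditionally must now handle `None`. A minimal stand-in sketch of the semantics (the `Param` and `toy_zero_grad` names are illustrative only, not DeepSpeed API):

```python
class Param:
    """Toy stand-in for a parameter with a .grad attribute."""
    def __init__(self):
        self.grad = [0.1, -0.2]  # pretend accumulated gradients

def toy_zero_grad(params, set_to_none=True):
    # Mirrors the semantics of zero_grad(set_to_none=...):
    #   True  -> drop the gradient buffer entirely (grad becomes None)
    #   False -> keep the buffer but fill it with zeros
    for p in params:
        if set_to_none:
            p.grad = None
        else:
            p.grad = [0.0] * len(p.grad)

params = [Param(), Param()]
toy_zero_grad(params)                     # new default behavior
print([p.grad for p in params])           # [None, None]

params = [Param(), Param()]
toy_zero_grad(params, set_to_none=False)  # old behavior, now opt-in
print([p.grad for p in params])           # [[0.0, 0.0], [0.0, 0.0]]
```

Setting grads to `None` skips a memset and frees the buffers, but callers should guard gradient reads with a `None` check after stepping.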
New Contributors
- @nelyahu made their first contribution in #4321
- @starkhu made their first contribution in #4308
- @sakogan made their first contribution in #3433
- @cctry made their first contribution in #4368
- @zeyugao made their first contribution in #3608
- @VeryLazyBoy made their first contribution in #4399
- @ajindal1 made their first contribution in #4391
- @Liangliang-Ma made their first contribution in #4130
- @Jackmin801 made their first contribution in #4438
- @UniverseFly made their first contribution in #4418
- @enneamer made their first contribution in #4331
Full Changelog: v0.10.3...v0.11.0