deepspeed 0.15.2
DeepSpeed
is a Python package developed by Microsoft that provides a deep learning optimization library designed to scale across multiple GPUs and servers. It is capable of training models with billions or even trillions of parameters, achieving excellent system throughput and efficiently scaling to thousands of GPUs.
DeepSpeed
is particularly useful for training and inference of large language models, and it falls under the category of Machine Learning Frameworks and Libraries. It is designed to work with PyTorch
and offers system innovations such as Zero Redundancy Optimizer (ZeRO), 3D parallelism, and model-parallelism to enable efficient training of large models.