horovod 0.28.1

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Horovod is a distributed training framework for Python that lets you train deep learning models across multiple GPUs and servers. It falls under the category of distributed computing libraries. Horovod integrates with TensorFlow, Keras, PyTorch, and Apache MXNet, and simplifies scaling up model training by handling the communication between workers under the hood.
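
As a rough illustration of what "under the hood" means here: in data-parallel training, each worker computes gradients on its own batch, and the gradients are then averaged across workers (an allreduce) before the weights are updated. The sketch below simulates that averaging step in plain Python. It does not use Horovod, and the function name allreduce_average is hypothetical, chosen only for this example.

```python
from typing import List


def allreduce_average(worker_grads: List[List[float]]) -> List[float]:
    """Return the element-wise mean of the workers' gradient vectors.

    Hypothetical helper: a single-process stand-in for the averaging
    that a real allreduce performs across processes.
    """
    num_workers = len(worker_grads)
    return [sum(values) / num_workers for values in zip(*worker_grads)]


# Three simulated workers, each holding a two-element gradient.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
averaged = allreduce_average(grads)
print(averaged)  # [3.0, 4.0]
```

After this step every worker holds the same averaged gradient, so all copies of the model stay in sync after the optimizer update.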

Stars: 14077, Watchers: 14077, Forks: 2221, Open Issues: 400

The horovod/horovod repo was created 6 years ago and the last code push was 4 days ago.
The project is very popular, with 14077 GitHub stars.

How to Install horovod

You can install horovod using pip:

pip install horovod

or add it to a project with poetry:

poetry add horovod
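
To confirm that the install succeeded, one option is to ask Python for the installed version. This is a minimal sketch using only the standard library. The helper name installed_version is an assumption made for this example, not part of horovod.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string such as "0.28.1" when horovod is installed, otherwise None.
print(installed_version("horovod"))
```

This only checks that the package metadata is present. Whether support for a given framework was built is a separate question; running horovodrun --check-build should report that.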

Package Details

Author: The Horovod Authors
License: Apache 2.0
Homepage: https://github.com/horovod/horovod
PyPI: https://pypi.org/project/horovod/
GitHub Repo: https://github.com/horovod/horovod

Classifiers

  • Scientific/Engineering/Artificial Intelligence

Code Examples

Here are some horovod code examples and snippets.
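
The following is a minimal sketch of a Horovod training script for PyTorch, assuming horovod was installed with PyTorch support. The toy linear model, the random data, and the helper scaled_learning_rate are illustrative assumptions; scaling the learning rate by the number of workers is a common heuristic rather than a requirement.

```python
"""Minimal Horovod + PyTorch sketch. Launch with: horovodrun -np 2 python train.py"""


def scaled_learning_rate(base_lr: float, num_workers: int) -> float:
    """Scale the learning rate linearly with the worker count (a common heuristic)."""
    return base_lr * num_workers


def train(steps: int = 10) -> None:
    try:
        import torch
        import horovod.torch as hvd
    except ImportError:
        print("This sketch needs torch and horovod built with PyTorch support.")
        return

    hvd.init()  # each process learns its rank and the total number of workers

    use_cuda = torch.cuda.is_available()
    if use_cuda:
        torch.cuda.set_device(hvd.local_rank())  # pin one GPU per process

    model = torch.nn.Linear(10, 1)  # toy model, for illustration only
    if use_cuda:
        model.cuda()

    lr = scaled_learning_rate(0.01, hvd.size())
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    # Start every worker from the same weights and optimizer state.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)

    # Wrap the optimizer so gradients are averaged across workers before each step.
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters()
    )

    device = next(model.parameters()).device
    for step in range(steps):
        # Random tensors stand in for a real, per-worker shard of a dataset.
        x = torch.randn(32, 10, device=device)
        y = torch.randn(32, 1, device=device)
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        if hvd.rank() == 0:  # log from one worker only
            print(f"step {step}: loss {loss.item():.4f}")


if __name__ == "__main__":
    train()
```

Run directly with python train.py, the script uses a single worker. Under horovodrun, each process runs the same code and the wrapped optimizer keeps the model copies in sync.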

GitHub Issues

The horovod package has 400 open issues on GitHub, including:

  • Cannot install horovod with tensorflow framework
  • mpirun check failed with version
  • Problem during installation
  • Elastic Horovod incompatible with tf.compat.v1.train.MonitoredTrainingSession
  • Pre-load dataset to memory
  • failed call to cuInit after calling hvd.init()
  • Build Horovod with temporarily installed CMake if necessary
  • learning rate can't update in resuming training
  • test_ray.py::test_gpu_ids_num_workers sometimes fails on Buildkite
  • can horovod work only with Gloo,without MPI?
  • Horovd + AMP Contrastive Learning Imprecision
  • GPU Head tests for elastic ray are flaky in CI
  • integrate with Lightning ecosystem CI
  • Error on installing: pip install horovod[pytorch] in Virtualbox ubuntu 18.04
  • Multi worker inference in Databricks

Related Packages & Articles

thinc 9.0.0

A refreshing functional take on deep learning, compatible with your favorite libraries

datasets 2.20.0

HuggingFace community-driven open-source library of datasets

clearml 1.16.2

ClearML - Auto-Magical Experiment Manager, Version Control, and MLOps for AI

nlp 0.4.0

HuggingFace/NLP is an open library of NLP datasets.

keras 3.4.1

Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow. The core data structures of Keras are layers and models. The philosophy is to keep simple things simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code via subclassing).