deepsparse 1.8.0


An inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application

Stars: 2946, Watchers: 2946, Forks: 169, Open Issues: 25

The neuralmagic/deepsparse repo was created 3 years ago and the last code push was 16 hours ago.
The project is popular, with 2946 GitHub stars.

How to Install deepsparse

You can install deepsparse using pip:

pip install deepsparse

or add it to a project with poetry:

poetry add deepsparse
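
After either install command, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the Python standard library (importlib.metadata); the helper name installed_version is a hypothetical name chosen for illustration and is not part of deepsparse.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    installed_version is an illustrative helper, not a deepsparse API.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("deepsparse")
    if version is None:
        print("deepsparse is not installed in this environment")
    else:
        print(f"deepsparse {version} is installed")
```

The check reads package metadata rather than importing deepsparse, so it stays fast and does not load the runtime.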

Package Details

Author: Neuralmagic, Inc.
License: Neural Magic DeepSparse Community License
GitHub Repo: neuralmagic/deepsparse


  • Scientific/Engineering
  • Scientific/Engineering/Artificial Intelligence
  • Scientific/Engineering/Mathematics
  • Software Development
  • Software Development/Libraries/Python Modules



Code Examples

Here are some deepsparse code examples and snippets.
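
As a minimal sketch, a sentiment-analysis pipeline could be wrapped as shown below. It assumes deepsparse is installed and that Pipeline.create accepts the "sentiment-analysis" task. The function name classify_sentiment is hypothetical, and model_path is a placeholder that the caller must supply (a local ONNX model directory or a SparseZoo stub); no specific model is implied here.

```python
from typing import Any, List


def classify_sentiment(texts: List[str], model_path: str) -> Any:
    """Run sentiment analysis on a list of strings with a DeepSparse pipeline.

    classify_sentiment is an illustrative wrapper, not part of deepsparse.
    model_path is supplied by the caller; this sketch does not assume any
    particular model.
    """
    # Imported lazily so this module still loads where deepsparse is absent.
    from deepsparse import Pipeline

    # Assumption: the "sentiment-analysis" task is available in the
    # installed deepsparse version.
    pipeline = Pipeline.create(task="sentiment-analysis", model_path=model_path)

    # The pipeline returns a structured output object with the predictions.
    return pipeline(sequences=texts)
```

Creating the pipeline compiles the model for the local CPU, so in a real application it is usually better to build it once and reuse it across calls rather than recreating it per request as this short sketch does.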

GitHub Issues

The deepsparse package has 25 open issues on GitHub, including:

  • Bug Fix: Make Numpy Array Outputs JSON Serializable for Server
  • [Text Generation] [Fix] Raise error when we use deepsparse engine and prompt_processing_length == sequence_lenght
  • [Text Generation] Optimize the slow update method in the KVCacheDecoder
  • [BugFix] Delay torch import until needed for deepsparse.transformers.eval_downstream
  • Update
  • DS eval for OPT on WikiText
  • [CLIP] Validation Script
  • Changes to support pass@k evaluation on the HumanEval dataset
  • [Text Generation] Turn off the (currently) inefficient external KV cache logic when internal KV cache management enabled
  • Implement OpenAI-compatible server
  • Generalize disabling batch size across engine interfaces
  • [Text Generation] KVCacheStorage Implementation
  • [Text Generation][Doc] Point to KV Cache Injection
  • Can't use DeepSparse with VITS ONNX file from Coqui TTS.
  • Encountered an issue when trying to optimize Donut model.

See more issues on GitHub

Related Packages & Articles

ludwig 0.10.3

Declarative machine learning: End-to-end machine learning pipelines using data-driven configurations.

auto-gptq 0.7.1

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

optimum 1.21.2

Optimum Library is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from Hardware Partners and interface with their specific functionality.

deeplake 3.9.14

Deep Lake is a Database for AI powered by a unique storage format optimized for deep-learning and Large Language Model (LLM) based applications. It simplifies the deployment of enterprise-grade LLM-based products by offering storage for all data types (embeddings, audio, text, videos, images, pdfs, annotations, etc.), querying and vector search, data streaming while training models at scale, data versioning and lineage for all workloads, and integrations with popular tools such as LangChain, LlamaIndex, Weights & Biases, and many more.

onnx 1.16.1

Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides an open source format for AI models, both deep learning and traditional ML. It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types. Currently we focus on the capabilities needed for inferencing (scoring).

datasets 2.20.0

HuggingFace community-driven open-source library of datasets