ctranslate2 4.1.0

Fast inference engine for Transformer models

Stars: 2737, Watchers: 2737, Forks: 240, Open Issues: 120

The OpenNMT/CTranslate2 repo was created 4 years ago and the last code push was 6 hours ago.
The project is popular, with 2737 GitHub stars.

How to Install ctranslate2

You can install ctranslate2 using pip:

pip install ctranslate2

Or add it to a project with Poetry:

poetry add ctranslate2
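
After installing, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and is not part of ctranslate2 itself.

```python
from importlib import metadata


def installed_version(package: str = "ctranslate2"):
    """Return the installed version string of `package`, or None if it is absent.

    `installed_version` is an illustrative helper, not a ctranslate2 API.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("ctranslate2 is not installed; try: pip install ctranslate2")
else:
    print(f"ctranslate2 {version} is installed")
```

The check reads package metadata rather than importing the module, so it does not load the compiled extension and works even in environments where the import itself would fail.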

Package Details

Author: OpenNMT
License: MIT
Homepage: https://opennmt.net
PyPI: https://pypi.org/project/ctranslate2/
Documentation: https://opennmt.net/CTranslate2
GitHub Repo: https://github.com/OpenNMT/CTranslate2

Classifiers

  • Scientific/Engineering/Artificial Intelligence

GitHub Issues

The ctranslate2 package has 120 open issues on GitHub, including:

  • The accuracy of model improved after quantized with ct2 in 8bit
  • Accept left offsets when applying position encodings
  • distilbert-base-uncased-mnli
  • Accept left offsets in the rotary embeddings layer
  • Accept left offsets in the masked softmax operator
  • Request for Implementing Support for wav2vec2, MMS, and XLS-R Models
  • Split converted model.bin into multiple .bin
  • Support left padding to forward batch prompts in a single step
  • A keyerror is raised when using the FALCON 40B model converted by ctranslate2
  • CPP inference Error. ** Error in `./run': double free or corruption (!prev):
  • Get encoding from flan T5
  • Continuous batching
  • Code for chat inference server
  • Exception when exporting bloomz model
  • [Feature] support PagedAttention in cuda attention.cc

Related Packages & Articles

flwr 1.8.0

Flower: A Friendly Federated Learning Framework

vosk 0.3.45

Offline open source speech recognition API based on Kaldi and Vosk

torchsde 0.2.6

SDE solvers and stochastic adjoint sensitivity analysis in PyTorch.

optimum 1.18.0

Optimum Library is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from Hardware Partners and interface with their specific functionality.

nncf 2.9.0

Neural Networks Compression Framework