ctranslate2 4.3.1


Fast inference engine for Transformer models

Stars: 3077, Watchers: 3077, Forks: 273, Open Issues: 161

The OpenNMT/CTranslate2 repo was created 4 years ago and the last code push was 1 week ago.
The project is very popular, with 3077 GitHub stars.

How to Install ctranslate2

You can install ctranslate2 using pip:

pip install ctranslate2

or add it to a project with poetry:

poetry add ctranslate2
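
After installing, one way to confirm that the package is visible to the current interpreter is to query its installed metadata. The sketch below uses only the Python standard library; the helper name `installed_version` is hypothetical and is not part of ctranslate2.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package has no installed distribution in this environment.
        return None


if __name__ == "__main__":
    version = installed_version("ctranslate2")
    print(version if version is not None else "ctranslate2 is not installed")
```

Returning `None` instead of raising keeps the check usable in setup scripts that only need to decide whether to run the install step.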

Package Details

  • Scientific/Engineering/Artificial Intelligence

Code Examples
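
As a minimal sketch of typical usage, the snippet below wraps the library's `Translator` class in a small helper. It assumes a model has already been converted to the CTranslate2 format and that the input is already tokenized with the model's own tokenizer. The helper name `translate_tokens`, the `model_dir` argument, and the path and tokens in the usage comment are placeholders, not values taken from this page.

```python
from typing import List


def translate_tokens(model_dir: str, tokens: List[str], device: str = "cpu") -> List[str]:
    """Translate one pre-tokenized sentence and return the best hypothesis.

    `model_dir` is assumed to point at a directory holding a model that was
    already converted to the CTranslate2 format.
    """
    # Imported lazily so this module can be loaded without ctranslate2 installed.
    import ctranslate2

    translator = ctranslate2.Translator(model_dir, device=device)
    # translate_batch takes a batch (list) of token lists, one per sentence.
    results = translator.translate_batch([tokens])
    # Each result holds a list of hypotheses; the first is the highest scoring.
    return results[0].hypotheses[0]


# Hypothetical usage (the path and tokens are placeholders):
# output = translate_tokens("path/to/converted_model", ["▁Hello", "▁world", "!"])
```

The output is a list of tokens, so it still needs to be detokenized with the same tokenizer that produced the input.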

GitHub Issues

The ctranslate2 package has 161 open issues on GitHub, including:

  • The accuracy of model improved after quantized with ct2 in 8bit
  • Accept left offsets when applying position encodings
  • distilbert-base-uncased-mnli
  • Accept left offsets in the rotary embeddings layer
  • Accept left offsets in the masked softmax operator
  • Request for Implementing Support for wav2vec2, MMS, and XLS-R Models
  • Split converted model.bin into multiple .bin
  • Support left padding to forward batch prompts in a single step
  • A keyerror is raised when using the FALCON 40B model converted by ctranslate2
  • CPP inference Error. ** Error in `./run': double free or corruption (!prev):
  • Get encoding from flan T5
  • Continuous batching
  • Code for chat inference server
  • Exception when exporting bloomz model
  • [Feature] support PagedAttention in cuda

