ctranslate2 4.4.0
Fast inference engine for Transformer models
Stars: 3303, Watchers: 3303, Forks: 289, Open Issues: 192. The OpenNMT/CTranslate2
repo was created 5 years ago and the last code push was 2 days ago.
The project is very popular, with an impressive 3303 GitHub stars!
How to Install ctranslate2
You can install ctranslate2 with pip:
pip install ctranslate2
or add it to a project with Poetry:
poetry add ctranslate2
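After installing, a quick sanity check confirms the package is importable (a minimal sketch; the version printed depends on your install):

```python
# Sanity check: confirm ctranslate2 is importable after installation.
# This check also degrades gracefully when the package is absent.
import importlib.util

spec = importlib.util.find_spec("ctranslate2")
if spec is None:
    print("ctranslate2 is not installed")
else:
    import ctranslate2
    print("ctranslate2", ctranslate2.__version__)
```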
Package Details
- Author: OpenNMT
- License: MIT
- Homepage: https://opennmt.net
- PyPI: https://pypi.org/project/ctranslate2/
- Documentation: https://opennmt.net/CTranslate2
- GitHub Repo: https://github.com/OpenNMT/CTranslate2
Classifiers
- Scientific/Engineering/Artificial Intelligence
Errors
A list of common ctranslate2 errors.
Code Examples
Here are some ctranslate2 code examples and snippets.
GitHub Issues
The ctranslate2 package has 192 open issues on GitHub:
- The accuracy of model improved after quantized with ct2 in 8bit
- Accept left offsets when applying position encodings
- distilbert-base-uncased-mnli
- Accept left offsets in the rotary embeddings layer
- Accept left offsets in the masked softmax operator
- Request for Implementing Support for wav2vec2, MMS, and XLS-R Models
- Split converted model.bin into multiple .bin
- Support left padding to forward batch prompts in a single step
- A keyerror is raised when using the FALCON 40B model converted by ctranslate2
- CPP inference Error. ** Error in `./run': double free or corruption (!prev):
- Get encoding from flan T5
- Continuous batching
- Code for chat inference server
- Exception when exporting bloomz model
- [Feature] support PagedAttention in cuda attention.cc