ctranslate2 4.1.0
Fast inference engine for Transformer models
Stars: 2737, Watchers: 2737, Forks: 240, Open Issues: 120
The OpenNMT/CTranslate2 repo was created 4 years ago, and the last code push was 6 hours ago.
The project is popular, with 2737 GitHub stars.
How to Install ctranslate2
You can install ctranslate2 using pip:
pip install ctranslate2
or add it to a project with Poetry:
poetry add ctranslate2
Package Details
- Author
- OpenNMT
- License
- MIT
- Homepage
- https://opennmt.net
- PyPI:
- https://pypi.org/project/ctranslate2/
- Documentation:
- https://opennmt.net/CTranslate2
- GitHub Repo:
- https://github.com/OpenNMT/CTranslate2
Classifiers
- Scientific/Engineering :: Artificial Intelligence
Related Packages
Errors
A list of common ctranslate2 errors.
Code Examples
Here are some ctranslate2 code examples and snippets.
GitHub Issues
The ctranslate2 package has 120 open issues on GitHub.
- The accuracy of model improved after quantized with ct2 in 8bit
- Accept left offsets when applying position encodings
- distilbert-base-uncased-mnli
- Accept left offsets in the rotary embeddings layer
- Accept left offsets in the masked softmax operator
- Request for Implementing Support for wav2vec2, MMS, and XLS-R Models
- Split converted model.bin into multiple .bin
- Support left padding to forward batch prompts in a single step
- A keyerror is raised when using the FALCON 40B model converted by ctranslate2
- CPP inference Error. ** Error in `./run': double free or corruption (!prev):
- Get encoding from flan T5
- Continuous batching
- Code for chat inference server
- Exception when exporting bloomz model
- [Feature] support PagedAttention in cuda attention.cc