ocrmypdf 16.5.0
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Stars: 13849, Watchers: 13849, Forks: 1007, Open Issues: 114
The ocrmypdf/OCRmyPDF repo was created 10 years ago and the last code push was 3 weeks ago.
The project is very popular, with 13849 GitHub stars.
How to Install ocrmypdf
You can install ocrmypdf using pip:
pip install ocrmypdf
or add it to a project with Poetry:
poetry add ocrmypdf
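After installing, it can help to confirm that the package is visible to Python and that the external programs OCRmyPDF relies on (Tesseract and Ghostscript) are on the PATH. The following is a minimal sketch using only the standard library. The helper names `installed_version` and `missing_executables` are made up for this example, and the executable names `tesseract` and `gs` are assumptions that fit typical Linux and macOS setups; they may differ on other platforms.

```python
import shutil
from importlib.metadata import PackageNotFoundError, version
from typing import List, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def missing_executables(names: List[str]) -> List[str]:
    """Return the entries of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

For example, `installed_version("ocrmypdf")` returns the version string if the install succeeded, and `missing_executables(["tesseract", "gs"])` lists any of those programs that still need to be installed (assuming those executable names apply on your system).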
Package Details
- Author: None
- License: MPL-2.0
- Homepage: None
- PyPI: https://pypi.org/project/ocrmypdf/
- Documentation: https://ocrmypdf.readthedocs.io/
- GitHub Repo: https://github.com/ocrmypdf/OCRmyPDF
Classifiers
- Scientific/Engineering/Image Recognition
- Text Processing/Indexing
- Text Processing/Linguistic
Code Examples
Here are some ocrmypdf code examples and snippets.
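One common way to use ocrmypdf from a script is to call its command-line tool. The sketch below assembles an `ocrmypdf` command from a few widely used options (`--language`, `--deskew`, `--sidecar`) and runs it with `subprocess`. The helper names `build_command` and `run_ocr` and the file names are hypothetical, chosen only for illustration; check the official documentation for the full option list before relying on it.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng",),
    deskew: bool = False,
    sidecar: Optional[str] = None,
) -> List[str]:
    """Assemble an ocrmypdf command line as a list of arguments."""
    # Multiple Tesseract languages are joined with '+', e.g. "eng+deu".
    cmd = ["ocrmypdf", "--language", "+".join(languages)]
    if deskew:
        # Straighten pages that were scanned at a slight angle.
        cmd.append("--deskew")
    if sidecar is not None:
        # Also write the recognized text to a separate text file.
        cmd += ["--sidecar", sidecar]
    # Input and output paths come last.
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocr(input_pdf: str, output_pdf: str, **options) -> None:
    """Run ocrmypdf; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_command(input_pdf, output_pdf, **options), check=True)
```

A call such as `run_ocr("scan.pdf", "scan_ocr.pdf", languages=["eng", "deu"], deskew=True)` would then OCR a hypothetical `scan.pdf` in English and German, provided ocrmypdf and the matching Tesseract language data are installed.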
GitHub Issues
The ocrmypdf package has 114 open issues on GitHub
- [Feature]: Switch to remove images?
- [Bug]: pdfa-image-compression=auto behaviour violates the principle of least surprise w.r.t. lossy/lossless optimisations
- Confused about --unpaper-args
- [Bug]: PDF/A-3B files generated with a widely used commercial encoder generate garbage OCR content
- [Feature]: OCR on pages with multiple text rotations
- Since many users do not know how to set up the environment, we bundled the required environment on top of OCRmyPDF and built a desktop app with Electron [Electron version of OCRmyPDF]
- [BUG] Frequently seeing Syntax Error (91811): Too few (2) args to 'cm' operator
- [BUG] 'DecompressionBombError' on a ACM PDF - need resolution limit on high DPI
- [BUG] Bold font in PDF is replaced by black bars
- [BUG] ghostscript fails due to small resolution value
- Snap package shouldn't ship all of the Tesseract OCR language files
- Only generate text files without generating PDF files
- Feature Request: GPU OCR pipeline e.g. via EasyOCR
- extra space in the result pdf when the input pdf is in Chinese
- Azure ocr with ocrmypdf