donut-python 1.0.9

OCR-free Document Understanding Transformer

Stars: 5320, Watchers: 5320, Forks: 421, Open Issues: 180

The clovaai/donut repo was created 1 year ago, and the last code push was 5 months ago.
The project is popular, with 5320 GitHub stars.

How to Install donut-python

You can install donut-python using pip:

pip install donut-python

Or add it to a project with Poetry:

poetry add donut-python
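
After either command, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and chosen for illustration, not part of donut-python.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("donut-python")
    if version is None:
        print("donut-python is not installed in this environment")
    else:
        print(f"donut-python {version} is installed")
```

If the script reports that the package is missing even though the install succeeded, the install most likely targeted a different Python environment than the one running the script.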

Package Details

Author: Geewook Kim, Teakgyu Hong, Moonbin Yim, JeongYeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park
License: MIT
Homepage: https://github.com/clovaai/donut
PyPI: https://pypi.org/project/donut-python/
GitHub Repo: https://github.com/clovaai/donut

Classifiers

  • Scientific/Engineering/Artificial Intelligence
  • Software Development/Libraries
  • Software Development/Libraries/Python Modules

Errors

A list of common donut-python errors.

Code Examples

Here are some donut-python code examples and snippets.
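
As a starting point, here is a hedged sketch of running inference with a pretrained checkpoint. It follows the usage pattern from the project's README as best recalled (`DonutModel.from_pretrained` and `model.inference`), and details may differ between versions. The checkpoint name `naver-clova-ix/donut-base-finetuned-cord-v2`, the `<s_{task}>` prompt pattern, and the helper names `task_prompt` and `parse_document` are assumptions for illustration; check them against the repository before relying on them.

```python
from typing import Any, Dict


def task_prompt(task_name: str) -> str:
    """Build a task start prompt such as "<s_cord-v2>" (assumed naming pattern)."""
    return f"<s_{task_name}>"


def parse_document(
    image_path: str,
    checkpoint: str = "naver-clova-ix/donut-base-finetuned-cord-v2",  # assumed name
    task_name: str = "cord-v2",
) -> Dict[str, Any]:
    """Run one image through a Donut checkpoint and return the first prediction."""
    # Heavy dependencies are imported lazily so this module loads without them.
    import torch
    from PIL import Image
    from donut import DonutModel

    model = DonutModel.from_pretrained(checkpoint)
    if torch.cuda.is_available():
        # Half precision on GPU, as in the README usage.
        model.half()
        model.to(torch.device("cuda"))
    else:
        # On CPU the README casts the encoder to bfloat16.
        model.encoder.to(torch.bfloat16)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    output = model.inference(image=image, prompt=task_prompt(task_name))
    return output["predictions"][0]
```

Calling `parse_document("receipt.png")` downloads the checkpoint on first use, so it needs network access and several gigabytes of disk and memory; the file name here is a placeholder.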

GitHub Issues

The donut-python package has 180 open issues on GitHub, including:

  • Issue while training with custom decoder and tokenizer
  • RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmStridedBatchedExFix
  • No answer in docVQA
  • Error in running sample colab - "Something went wrong Unexpected end of JSON input"
  • How to generate own dataset in parquet format, just like the data format given by readme
  • Are the parameters of swinTransformer trained during fine-tuning?
  • 'SwinTransformer' object has no attribute 'pos_drop'
  • size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  • Backslash Rendered as Boxes with Cross Inside
  • Data Extraction Fine Tuning
  • base donut model keeps producing repeated characters
  • Synthdog generates special symbols
  • receipt scanning - issue with same line items
  • Neither of the DocVQA Task1 (Document VQA) demos work
  • What are some available prompts

Related Packages & Articles

ktrain 0.41.3

ktrain is a wrapper for TensorFlow Keras that makes deep learning and AI more accessible and easier to apply

datasets 2.19.0

HuggingFace community-driven open-source library of datasets

nlp 0.4.0

HuggingFace/NLP is an open library of NLP datasets.