pyoptimus 23.5.0b0
Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.
PyOptimus is a Python library that brings together the power of various data processing engines like Pandas, Dask, cuDF, Dask-cuDF, Vaex, and PySpark under a single, easy-to-use API. It offers over 100 functions for data cleaning and processing, including handling strings, processing dates, URLs, and emails. PyOptimus also provides out-of-the-box functions for data exploration and quality fixing. One of the key features of PyOptimus is its ability to handle large datasets efficiently, allowing you to use the same code to process data on your laptop or on a remote cluster of GPUs.
Stars: 1478, Watchers: 1478, Forks: 232, Open Issues: 29
The hi-primus/optimus repo was created 7 years ago and the last code push was 6 days ago.
The project is popular, with 1478 GitHub stars.
How to Install pyoptimus
You can install pyoptimus using pip:
pip install pyoptimus
Or add it to a project with Poetry:
poetry add pyoptimus
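After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library's importlib.metadata; the helper name installed_version is hypothetical and is not part of pyoptimus.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not a pyoptimus function.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pyoptimus" is the distribution name used in the install commands above.
    version = installed_version("pyoptimus")
    if version is None:
        print("pyoptimus is not installed in this environment")
    else:
        print(f"pyoptimus {version} is installed")
```

If the script reports that the package is missing even though the install succeeded, pip or Poetry most likely installed it into a different environment than the one running the script.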
Package Details
- Author
- Argenis Leon
- License
- APACHE
- Homepage
- https://github.com/hi-primus/optimus/
- PyPI:
- https://pypi.org/project/pyoptimus/
- GitHub Repo:
- https://github.com/hi-primus/optimus
Classifiers
- Scientific/Engineering/Artificial Intelligence
GitHub Issues
The pyoptimus package has 29 open issues on GitHub:
- Scheduled biweekly dependency update for week 29