pyoptimus 23.5.0b0

Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.

PyOptimus is a Python library that brings together the power of various data processing engines like Pandas, Dask, cuDF, Dask-cuDF, Vaex, and PySpark under a single, easy-to-use API. It offers over 100 functions for data cleaning and processing, including handling strings, processing dates, URLs, and emails. PyOptimus also provides out-of-the-box functions for data exploration and quality fixing. One of the key features of PyOptimus is its ability to handle large datasets efficiently, allowing you to use the same code to process data on your laptop or on a remote cluster of GPUs.

Stars: 1439, Watchers: 1439, Forks: 233, Open Issues: 29

The hi-primus/optimus repo was created 6 years ago and the last code push was 1 week ago.
The project is popular, with 1439 GitHub stars.

How to Install pyoptimus

You can install pyoptimus using pip:

pip install pyoptimus

or add it to a project with poetry:

poetry add pyoptimus
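
After either command finishes, one way to confirm that the package is visible to the current interpreter is to query its installed metadata. The following is a minimal sketch that uses only the Python standard library (Python 3.8 or later); the helper name `installed_version` is hypothetical and is not part of pyoptimus.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not a pyoptimus function.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    found = installed_version("pyoptimus")
    if found is None:
        print("pyoptimus is not installed in this environment")
    else:
        print(f"pyoptimus {found} is installed")
```

Querying the distribution metadata avoids importing the library itself, so the check stays fast and does not depend on any optional engine being available.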

Package Details

Author: Argenis Leon
License: APACHE
Homepage: https://github.com/hi-primus/optimus/
PyPI: https://pypi.org/project/pyoptimus/
GitHub Repo: https://github.com/hi-primus/optimus

Classifiers

  • Scientific/Engineering/Artificial Intelligence

GitHub Issues

The pyoptimus package has 29 open issues on GitHub, including:

  • Scheduled biweekly dependency update for week 29

Related Packages & Articles

optimuspyspark 2.2.32

Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion with pyspark.

sweetviz 2.3.1

A pandas-based library to visualize and compare datasets.

mage-ai 0.9.68

Mage is a tool for building and deploying data pipelines.

fiftyone 0.23.7

FiftyOne: the open-source tool for building high-quality datasets and computer vision models

gradio 4.26.0

Python library for easily interacting with trained machine learning models