vaex 4.17.0

Out-of-Core DataFrames to visualize and explore big tabular datasets

08-23-2023

Stars: 8279, Watchers: 8279, Forks: 590, Open Issues: 533

The vaexio/vaex repo was created 10 years ago, and the last code push was 4 days ago.
The project is popular, with 8279 GitHub stars.

How to Install vaex

You can install vaex using pip:

pip install vaex

or add it to a project with poetry:

poetry add vaex
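
After installing, it can be useful to confirm which version ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and is not part of vaex.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package itself.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed vaex version, or None when vaex is not installed.
print(installed_version("vaex"))
```

Reading the metadata avoids importing vaex, so the check stays fast and still works if the import itself would fail.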

Package Details

Author: Maarten A. Breddels
License: MIT
Homepage: https://www.github.com/vaexio/vaex
PyPI: https://pypi.org/project/vaex/
GitHub Repo: https://github.com/vaexio/vaex

Errors

A list of common vaex errors.

Code Examples

Here are some vaex code examples and snippets.

GitHub Issues

The vaex package has 533 open issues on GitHub.

[BUG-REPORT] converting massive CSV (50GB) stalls
[BUG-REPORT] AttributeError: 'ProgressBar' object has no attribute 'stime0'
vx.from_pandas(df).export_hdf5(path) giving KeyError while writing pandas df to HDF5 file.
DataFrame.max returning array containing -inf values
Issue on page /tutorial_jupyter.html
[BUG-REPORT] PydanticImportError: BaseSettings has been moved
[BUG-REPORT] AssertionError while performing math operation on shifted columns
Fixes #2350 Implementing take function in Vaex for first n colums
fix bug : open csv file use delimiter other than comma。
[Bug Fix] Broken graphQL query comparisons
Interactive widget fix
dont use take with arrow
Build aarch64 wheels and support python 3.11
fix typos in the learn more about vex section from the README file
Fix: evaluate iterator when selection=True

See more issues on GitHub

Related Packages & Articles

ludwig 0.10.4

Declarative machine learning: End-to-end machine learning pipelines using data-driven configurations.

PyOptimus is a Python library that brings together the power of various data processing engines like Pandas, Dask, cuDF, Dask-cuDF, Vaex, and PySpark under a single, easy-to-use API. It offers over 100 functions for data cleaning and processing, including handling strings, processing dates, URLs, and emails. PyOptimus also provides out-of-the-box functions for data exploration and quality fixing. One of the key features of PyOptimus is its ability to handle large datasets efficiently, allowing you to use the same code to process data on your laptop or on a remote cluster of GPUs.