Python bindings and extensions for Velox

How to Install pyvelox

You can install pyvelox using pip:

pip install pyvelox

or add it to a project with Poetry:

poetry add pyvelox
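
After either command, it can be useful to confirm that the package actually landed in the active environment. The following is a minimal sketch that relies only on the Python standard library (importlib.metadata); the helper name installed_version is hypothetical and is not part of pyvelox itself.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "pyvelox") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent.

    Hypothetical helper for illustration; it only reads the distribution
    metadata and never imports the package.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("pyvelox is not installed in this environment")
    else:
        print(f"pyvelox {version} is installed")
```

Reading the distribution metadata rather than importing the module keeps the check cheap and avoids loading the native extension just to report a version.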

GitHub Issues

The pyvelox package has 937 open issues on GitHub, including:

  • Fix handling of exceptions in try_cast from Json
  • Parquet row group prefetching doesn't work properly
  • Add date_add Spark function
  • Fix integer divide by zero when reading orc file using velox
  • [GLUTEN] Support rand SparkSQL function with seed specified
  • Cast(string as integer) behaves differently from SparkSQL with floating point input
  • Cast(string as integer) behaves differently from Presto and Spark on a corner case
  • Add ability to generate nested lazy children for RowVectors
  • Fix unnest for array of rows
  • Only verify unchanged input in verifier when there are no lazy inputs
  • Ensure encoding layers are initialized when loading lazy vectors
  • Consolidate LZ4 and LZO decompression of Parquet to dwio::common
  • Fix getByteRange when it encounters an invalid UTF code point.
  • Add shrinkPool API to MemoryManager/MemoryArbitrator
  • feat: PyVelox implementation for Array Vector

