pandasai 0.8.2


Pandas AI is a Python library that integrates generative artificial intelligence capabilities into Pandas, making dataframes conversational.

Stars: 8070, Watchers: 8070, Forks: 601, Open Issues: 74

The gventuri/pandas-ai repo was created 3 months ago, and the last code push was 12 hours ago.
The project is very popular, with 8070 GitHub stars.

How to Install pandasai

You can install pandasai using pip:

pip install pandasai

Or add it to a project with Poetry:

poetry add pandasai
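
After either install command, it can be useful to confirm that the package is actually visible to the interpreter you intend to use. The following is a minimal sketch that relies only on the Python standard library (importlib.metadata); the helper name installed_version is hypothetical and is not part of pandasai.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent.

    `installed_version` is an illustrative helper, not a pandasai API.
    """
    try:
        # Looks up the distribution metadata recorded by pip or Poetry.
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version("pandasai")
    if found is None:
        print("pandasai is not installed in this environment")
    else:
        print(f"pandasai {found} is installed")
```

Running this with the same interpreter that pip or Poetry targeted helps rule out the common case where the package was installed into a different virtual environment.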

Package Details

Author: Gabriele Venturi
GitHub Repo: gventuri/pandas-ai




GitHub Issues

The pandasai package has 74 open issues on GitHub, including:

  • Support Llama v2 and Text generation inference
  • Try to import a package when NameError is raised during execution
  • Token Limits with 420 column and 119546 rows
  • Clean_data and Impute_missing_values not working as expected
  • Add support for Azure, OpenAI, Palm, Anthropic, Cohere Models - using litellm
  • Plotted graph has overlapping labels
  • Added Poe-api as LLM reference
  • Database adapters
  • fix: environment for executing code
  • Raise "TypeError" when trying to save cache
  • Semantic search for previously asked questions
  • Arbitrary file read and arbitrary file write by prompt injection
  • hello, why did the code like "df = df[df['content'].str.contains('xxxx')] " didn't work?
  • The fix of #issue399 (RCE from prompt) can be bypassed.
  • Can pandasai specific the vector db location ?

See more issues on GitHub

Related Packages & Articles

pyoptimus 23.5.0b0

PyOptimus is a Python library that brings together the power of various data processing engines like Pandas, Dask, cuDF, Dask-cuDF, Vaex, and PySpark under a single, easy-to-use API. It offers over 100 functions for data cleaning and processing, including handling strings, processing dates, URLs, and emails. PyOptimus also provides out-of-the-box functions for data exploration and quality fixing. One of the key features of PyOptimus is its ability to handle large datasets efficiently, allowing you to use the same code to process data on your laptop or on a remote cluster of GPUs.

deeplake 3.6.14

Deep Lake is a Database for AI powered by a unique storage format optimized for deep-learning and Large Language Model (LLM) based applications. It simplifies the deployment of enterprise-grade LLM-based products by offering storage for all data types (embeddings, audio, text, videos, images, pdfs, annotations, etc.), querying and vector search, data streaming while training models at scale, data versioning and lineage for all workloads, and integrations with popular tools such as LangChain, LlamaIndex, Weights & Biases, and many more.

sweetviz 2.1.4

A pandas-based library to visualize and compare datasets.

dtale 3.3.0

Web Client for Visualizing Pandas Objects

pandas 2.0.3

Powerful data structures for data analysis, time series, and statistics

Random Data Generation & Data Visualization with Python

In this blog post, we'll use Matplotlib, NumPy, and Pandas to generate and visualize data, and discuss the programming concepts, methods, and functionality used in the script.