
zbrookle / dataframe_sql

License: BSD-3-Clause
A Python package that parses SQL and translates it into method calls on pandas (or other types of) DataFrames that have been declared and registered.

Programming Languages

python
shell

Projects that are alternatives to or similar to dataframe_sql

dh-core
Functional data science
Stars: ✭ 123 (+29.47%)
Mutual labels:  dataframes
ElasticBatch
Elasticsearch tool for easily collecting and batch inserting Python data and pandas DataFrames
Stars: ✭ 21 (-77.89%)
Mutual labels:  dataframes
DataFrames
Welcome to DataFrames.jl with Bogumił Kamiński
Stars: ✭ 106 (+11.58%)
Mutual labels:  dataframes
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (-70.53%)
Mutual labels:  dataframes
jun
JUN - python pandas, plotly, seaborn support & dataframes manipulation over erlang
Stars: ✭ 21 (-77.89%)
Mutual labels:  dataframes
polars
Fast multi-threaded DataFrame library in Rust | Python | Node.js
Stars: ✭ 6,368 (+6603.16%)
Mutual labels:  dataframes
datatile
A library for managing, validating, summarizing, and visualizing data.
Stars: ✭ 419 (+341.05%)
Mutual labels:  dataframes
Movies-Analytics-in-Spark-and-Scala
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Stars: ✭ 47 (-50.53%)
Mutual labels:  dataframes
woodwork
Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
Stars: ✭ 97 (+2.11%)
Mutual labels:  dataframes
pandas-workshop
An introductory workshop on pandas with notebooks and exercises for following along.
Stars: ✭ 161 (+69.47%)
Mutual labels:  dataframes
heidi
heidi : tidy data in Haskell
Stars: ✭ 24 (-74.74%)
Mutual labels:  dataframes
DataTables.jl
(DEPRECATED) A rewrite of DataFrames.jl based on Nullable
Stars: ✭ 28 (-70.53%)
Mutual labels:  dataframes

dataframe_sql


dataframe_sql is a Python package that translates SQL syntax into operations on pandas DataFrames, functionality that is not available in the core pandas package.

Installation

pip install dataframe_sql

Usage

In this simple example, a DataFrame is read from a CSV file and registered under a table name. The query function then runs a SQL query against it and returns a new DataFrame.

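from pandas import read_csv
from dataframe_sql import register_temp_table, query

# Load the data into a pandas DataFrame
my_table = read_csv("some_file.csv")

# Register the DataFrame under the table name used in the SQL query
register_temp_table(my_table, "my_table")

# Run the query; the result is a new DataFrame
result = query("""select * from my_table""")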

The package currently supports only pandas, but there are plans to support dask, rapids, and modin in the future.

SQL Syntax

The SQL syntax for dataframe_sql is exactly the same as the syntax in sql_to_ibis, its underlying package.

The full SQL syntax is described in the sql_to_ibis documentation.

Why use dataframe_sql?

Other packages also let you use SQL with pandas DataFrames, but those packages, such as pandasql, run the query in a database on the backend, which defeats the purpose of using pandas to begin with. pandasql, for example, uses SQLite, which can result in major performance bottlenecks. dataframe_sql instead performs native pandas operations on DataFrames in memory, which avoids the conflicts that may arise from using external databases.
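
To make this contrast concrete, here is a minimal sketch that runs one query two ways. The first copies the DataFrame into an in-memory SQLite database and reads the result back, which is roughly the database-backed route, written here with pandas' own to_sql and read_sql_query rather than pandasql. The second expresses the same query as plain pandas operations. The pandas chain is hand-written for illustration and is not the code dataframe_sql generates; the orders data and the names via_sqlite and via_pandas are made up for this example.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data, used only for illustration.
orders = pd.DataFrame(
    {
        "customer": ["ann", "bob", "ann", "cy"],
        "amount": [10.0, 5.0, 7.5, 3.0],
    }
)

SQL = """
    select customer, sum(amount) as total
    from orders
    where amount > 4
    group by customer
    order by customer
"""


def via_sqlite(df: pd.DataFrame) -> pd.DataFrame:
    """Database-backed route: copy the frame into SQLite, query it, copy back."""
    conn = sqlite3.connect(":memory:")
    try:
        df.to_sql("orders", conn, index=False)  # serialize rows into a table
        return pd.read_sql_query(SQL, conn)     # deserialize the result set
    finally:
        conn.close()


def via_pandas(df: pd.DataFrame) -> pd.DataFrame:
    """Native route: the same query written directly as pandas operations."""
    return (
        df[df["amount"] > 4]                         # where amount > 4
        .groupby("customer", as_index=False)["amount"]
        .sum()                                       # sum(amount) ... group by
        .rename(columns={"amount": "total"})         # as total
        .sort_values("customer")                     # order by customer
        .reset_index(drop=True)
    )


print(via_sqlite(orders))
print(via_pandas(orders))
```

Both functions return the same rows. The difference is that the SQLite route copies every row out of pandas and back again, while the native route never leaves the DataFrame.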
