
otsaloma / dataiter

Licence: MIT license
Python classes for data manipulation



Python Classes for Data Manipulation


Dataiter currently includes the following classes.

DataFrame is a class for tabular data similar to R's data.frame or pandas.DataFrame. Under the hood, it is a dictionary of NumPy arrays and is thus capable of fast vectorized operations. You can consider it a lightweight alternative to Pandas with a simple and consistent API. Performance-wise, Dataiter relies on NumPy and Numba and is likely to be, at best, comparable to Pandas.
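
To illustrate the "dictionary of NumPy arrays" idea, here is a minimal sketch using plain NumPy rather than Dataiter's own API. The column names and values are made up for illustration, and the real class does considerably more than this.

```python
import numpy as np

# A hypothetical table stored column-wise: one NumPy array per column.
columns = {
    "name": np.array(["a", "b", "c", "d"]),
    "price": np.array([10.0, 25.0, 40.0, 5.0]),
    "guests": np.array([2, 4, 2, 1]),
}

# Filtering rows is one vectorized comparison plus a boolean-mask
# lookup applied to every column, with no Python-level loop over rows.
mask = columns["guests"] == 2
filtered = {key: values[mask] for key, values in columns.items()}

# Column-wise arithmetic is likewise a single vectorized expression.
price_per_guest = columns["price"] / columns["guests"]
```

Because every column is an array of the same length, a single mask selects the same rows in all of them, which is what keeps row operations fast.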

ListOfDicts is a class useful for manipulating data from JSON APIs. It provides functionality similar to libraries such as Underscore.js, with manipulation functions that iterate over the data and return a shallow, modified copy of the original. attd.AttributeDict is used to provide convenient attribute access to dictionary keys.
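
The "iterate and return a shallow modified copy" behaviour can be sketched in pure Python as follows. This is not Dataiter's implementation: AttrDict, filter_items and modify_items are hypothetical names standing in for attd.AttributeDict and the ListOfDicts methods, and the data is made up.

```python
class AttrDict(dict):
    """Hypothetical stand-in for attd.AttributeDict: keys readable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


def filter_items(items, function):
    """Return a new list with the items for which function(item) is true."""
    return [item for item in items if function(item)]


def modify_items(items, **key_function_pairs):
    """Return a new list of shallow-copied items with keys set to function(item)."""
    result = []
    for item in items:
        item = AttrDict(item)  # shallow copy; the original dict is left untouched
        for key, function in key_function_pairs.items():
            item[key] = function(item)
        result.append(item)
    return result


listings = [AttrDict(id=1, price=100, guests=2), AttrDict(id=2, price=90, guests=3)]
cheap = filter_items(listings, lambda x: x.price < 95)
with_ratio = modify_items(listings, price_per_guest=lambda x: x.price / x.guests)
```

Since each function returns a new list, calls can be chained without mutating the input, while the shallow copy avoids duplicating nested values.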

GeoJSON is a simple wrapper class that allows reading a GeoJSON file into a DataFrame and writing a data frame to a GeoJSON file. Any operations on the data are thus done with methods provided by the data frame class. Geometry is read as-is into the "geometry" column, but no special geometric operations are currently supported.
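
As a rough sketch of what reading GeoJSON into columns involves, the following uses only the standard library. The function name features_to_columns and the sample data are assumptions for illustration, not part of Dataiter.

```python
import json

# A tiny made-up GeoJSON FeatureCollection for illustration.
text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"name": "A", "population": 10},
   "geometry": {"type": "Point", "coordinates": [24.9, 60.2]}},
  {"type": "Feature", "properties": {"name": "B", "population": 20},
   "geometry": {"type": "Point", "coordinates": [25.0, 60.3]}}
]}
"""


def features_to_columns(collection):
    """Turn feature properties into columns; keep geometry dicts untouched."""
    features = collection["features"]
    keys = []
    for feature in features:
        for key in feature["properties"]:
            if key not in keys:
                keys.append(key)
    columns = {key: [f["properties"].get(key) for f in features] for key in keys}
    # Geometry is stored as-is, one object per row, with no parsing.
    columns["geometry"] = [f["geometry"] for f in features]
    return columns


columns = features_to_columns(json.loads(text))
```

Each property becomes an ordinary column, so only the "geometry" column needs special handling when writing the data back out.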

Installation

# Latest stable version
pip install -U dataiter

# Latest development version
pip install -U git+https://github.com/otsaloma/dataiter

# Numba (optional)
pip install -U numba

Dataiter optionally uses Numba to speed up certain operations. If you have Numba installed and importing it succeeds, Dataiter will use it automatically. It's currently not a hard dependency, so you need to install it separately.
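
An optional dependency of this kind is commonly handled with a guarded import. The sketch below shows that general pattern and is not necessarily how Dataiter does it; try_import, USE_NUMBA and total are hypothetical names.

```python
import importlib


def try_import(name):
    """Return the named module if it imports cleanly, otherwise None."""
    try:
        return importlib.import_module(name)
    except Exception:
        # Treat any import-time failure the same as a missing package.
        return None


numba = try_import("numba")
USE_NUMBA = numba is not None


def total(values):
    """Sum values in a plain Python loop (hypothetical example function)."""
    result = 0.0
    for value in values:
        result += value
    return result


# Compile the function only when Numba is available; otherwise keep
# the pure-Python version so that behavior is the same either way.
if USE_NUMBA:
    total = numba.njit(total)
```

Catching failures at import time means a broken or missing Numba installation falls back to the slower path instead of raising an error.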

Documentation

https://dataiter.readthedocs.io/

If you're familiar with either dplyr (R) or Pandas (Python), the comparison table in the documentation will give you a quick overview of the differences and similarities.

https://dataiter.readthedocs.io/en/latest/comparison.html
