JuliaData / IndexedTables.jl

Licence: MIT license
Flexible tables with ordered indices

IndexedTables.jl

IndexedTables provides tabular data structures where some of the columns form a sorted index. It provides the backend to JuliaDB, but can be used on its own for efficient in-memory data processing and analytics.

Data Structures

IndexedTables offers two data structures: IndexedTable and NDSparse.

  • Both types store data in columns.
  • IndexedTable and NDSparse differ mainly in how data is accessed.
  • Both types have equal performance for Table operations (select, filter, etc.).

Quickstart

using Pkg
Pkg.add("IndexedTables")
using IndexedTables

t = table((x = 1:100, y = randn(100)))

select(t, :x)

filter(row -> row.y > 0, t)
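The Quickstart calls above can be sketched on deterministic data, so the results are easy to check by hand. This is a minimal sketch, not the package's reference usage: the variable names (`xs`, `pos`, `total`) and the sample values are illustrative assumptions, and `reduce` with the `select` keyword is used as we understand it from the JuliaDB documentation.

```julia
using IndexedTables

# Deterministic columns instead of randn, so the outcome is predictable.
# y runs from -4.5 to 4.5 in steps of 1.0 (ten values, five of them positive).
t = table((x = 1:10, y = collect(-4.5:1.0:4.5)))

xs = select(t, :x)                   # extract a single column
pos = filter(row -> row.y > 0, t)    # keep only rows where y is positive
total = reduce(+, t; select = :x)    # fold the x column with +

println(length(pos), " rows have positive y; sum of x is ", total)
```

`select` returns the column itself, while `filter` returns a new table, so the two compose naturally when a filtered column is needed.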

IndexedTable vs. NDSparse

First let's create some data to work with.

using Dates

city = vcat(fill("New York", 3), fill("Boston", 3))

dates = repeat(Date(2016,7,6):Day(1):Date(2016,7,8), 2)

vals = [91, 89, 91, 95, 83, 76]

IndexedTable

  • (Optionally) Sorted by primary key(s), pkey.
  • Data is accessed as a Vector of NamedTuples.

using IndexedTables

julia> t1 = table((city = city, dates = dates, values = vals); pkey = [:city, :dates])
Table with 6 rows, 3 columns:
city        dates       values
──────────────────────────────
"Boston"    2016-07-06  95
"Boston"    2016-07-07  83
"Boston"    2016-07-08  76
"New York"  2016-07-06  91
"New York"  2016-07-07  89
"New York"  2016-07-08  91

julia> t1[1]
(city = "Boston", dates = 2016-07-06, values = 95)
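Because an IndexedTable behaves like a vector of NamedTuples, rows can be indexed, filtered, and iterated directly. The following is a minimal sketch under that assumption; the variable names (`first_row`, `keys_used`, `boston`, `hottest`) are illustrative and not part of the package.

```julia
using Dates
using IndexedTables

# Same sample data as above, repeated so the block is self-contained.
city  = vcat(fill("New York", 3), fill("Boston", 3))
dates = repeat(Date(2016,7,6):Day(1):Date(2016,7,8), 2)
vals  = [91, 89, 91, 95, 83, 76]

t1 = table((city = city, dates = dates, values = vals); pkey = [:city, :dates])

first_row = t1[1]                                # a NamedTuple; rows follow primary-key order
keys_used = pkeynames(t1)                        # names of the primary-key columns
boston    = filter(r -> r.city == "Boston", t1)  # row-wise filter returns a new table
hottest   = maximum(r.values for r in rows(t1))  # iterate over rows as NamedTuples

println(first_row.city, " comes first; the highest value is ", hottest)
```

Note that the input listed New York first, but the table is sorted by `pkey`, so the Boston rows come first.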

NDSparse

  • Sorted by index variables (first argument).
  • Data is accessed as an N-dimensional sparse array with arbitrary indexes.

julia> t2 = ndsparse((city=city, dates=dates), (value=vals,))
2-d NDSparse with 6 values (1 field named tuples):
city        dates      │ value
───────────────────────┼──────
"Boston"    2016-07-06 │ 95
"Boston"    2016-07-07 │ 83
"Boston"    2016-07-08 │ 76
"New York"  2016-07-06 │ 91
"New York"  2016-07-07 │ 89
"New York"  2016-07-08 │ 91

julia> t2["Boston", Date(2016, 7, 6)]
(value = 95,)
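Array-style access also allows slicing along one dimension. The following is a minimal sketch, assuming that a colon in an index position selects every key in that dimension, as it would for an ordinary array; the variable names (`boston`, `one`) are illustrative.

```julia
using Dates
using IndexedTables

# Same sample data as above, repeated so the block is self-contained.
city  = vcat(fill("New York", 3), fill("Boston", 3))
dates = repeat(Date(2016,7,6):Day(1):Date(2016,7,8), 2)
vals  = [91, 89, 91, 95, 83, 76]

t2 = ndsparse((city = city, dates = dates), (value = vals,))

boston = t2["Boston", :]                   # slice: every date for one city
one    = t2["New York", Date(2016, 7, 7)]  # scalar lookup: a single NamedTuple

println(length(boston), " Boston entries; New York on 2016-07-07 was ", one.value)
```

A scalar lookup returns the stored value, while a slice returns a smaller NDSparse that can be indexed again.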

Get started

For more information, check out the JuliaDB Documentation.