
131 open source projects that are alternatives to or similar to Kartothek

Roapi
Create full-fledged APIs for static datasets without writing a single line of code.
Stars: ✭ 253 (+75.69%)
Mutual labels:  parquet, arrow
Vscode Data Preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (+70.14%)
Mutual labels:  parquet, arrow
Awkward 0.x
Manipulate arrays of complex data structures as easily as Numpy.
Stars: ✭ 216 (+50%)
Mutual labels:  parquet, arrow
graphique
GraphQL service for arrow tables and parquet data sets.
Stars: ✭ 28 (-80.56%)
Mutual labels:  arrow, parquet
Arrow.jl
Pure Julia implementation of the apache arrow data format (https://arrow.apache.org/)
Stars: ✭ 92 (-36.11%)
Mutual labels:  arrow
Android Expandicon
Nice and simple customizable implementation of Google style up/down expand arrow.
Stars: ✭ 871 (+504.86%)
Mutual labels:  arrow
Parquet Format
Apache Parquet
Stars: ✭ 800 (+455.56%)
Mutual labels:  parquet
Arrow
Λrrow - Functional companion to Kotlin's Standard Library
Stars: ✭ 4,771 (+3213.19%)
Mutual labels:  arrow
Leader Line
Draw a leader line in your web page.
Stars: ✭ 1,872 (+1200%)
Mutual labels:  arrow
Sparksql Protobuf
Read SparkSQL parquet file as RDD[Protobuf]
Stars: ✭ 82 (-43.06%)
Mutual labels:  parquet
Cudf
cuDF - GPU DataFrame Library
Stars: ✭ 4,370 (+2934.72%)
Mutual labels:  arrow
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-79.86%)
Mutual labels:  parquet
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-32.64%)
Mutual labels:  parquet
Array Api
RFC document, tooling and other content related to the array API standard
Stars: ✭ 26 (-81.94%)
Mutual labels:  pydata
Amazon S3 Find And Forget
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
Stars: ✭ 115 (-20.14%)
Mutual labels:  parquet
Pyjanitor
Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 647 (+349.31%)
Mutual labels:  pydata
Parquet Mr
Apache Parquet
Stars: ✭ 1,278 (+787.5%)
Mutual labels:  parquet
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+172.92%)
Mutual labels:  parquet
Pydata Chicago2016 Ml Tutorial
Machine learning with scikit-learn tutorial at PyData Chicago 2016
Stars: ✭ 128 (-11.11%)
Mutual labels:  pydata
Petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Stars: ✭ 1,108 (+669.44%)
Mutual labels:  parquet
Arrow
🏹 Parse JSON with style
Stars: ✭ 355 (+146.53%)
Mutual labels:  arrow
Parquet Cpp
Apache Parquet
Stars: ✭ 339 (+135.42%)
Mutual labels:  parquet
Pystore
Fast data store for Pandas time-series data
Stars: ✭ 325 (+125.69%)
Mutual labels:  parquet
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (-24.31%)
Mutual labels:  parquet
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-59.72%)
Mutual labels:  parquet
Elasticsearch loader
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Stars: ✭ 300 (+108.33%)
Mutual labels:  parquet
Pre Short Closures
Stars: ✭ 36 (-75%)
Mutual labels:  arrow
Kglab
Graph-Based Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, RDFlib, pySHACL, RAPIDS, NetworkX, iGraph, PyVis, pslpython, pyarrow, etc.
Stars: ✭ 98 (-31.94%)
Mutual labels:  parquet
Pydata.kr
The official PyData Korea website. (In preparation)
Stars: ✭ 13 (-90.97%)
Mutual labels:  pydata
Drill
Apache Drill is a distributed MPP query layer for self describing data
Stars: ✭ 1,619 (+1024.31%)
Mutual labels:  parquet
Neural Image Captioning
Implementation of Neural Image Captioning model using Keras with Theano backend
Stars: ✭ 12 (-91.67%)
Mutual labels:  pydata
Pyvtreat
vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
Stars: ✭ 92 (-36.11%)
Mutual labels:  pydata
Parquet Generator
Parquet file generator
Stars: ✭ 16 (-88.89%)
Mutual labels:  parquet
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+1040.28%)
Mutual labels:  parquet
React Archer
🏹 Draw arrows between React elements 🖋
Stars: ✭ 666 (+362.5%)
Mutual labels:  arrow
Open Arrow
Open Arrow is an open-source font that contains 112 arrow symbols from U+2190 to U+21ff
Stars: ✭ 89 (-38.19%)
Mutual labels:  arrow
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+324.31%)
Mutual labels:  arrow
Parquet Go
Go package to read and write parquet files. parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
Stars: ✭ 114 (-20.83%)
Mutual labels:  parquet
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+181.94%)
Mutual labels:  parquet
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats such as Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (-40.28%)
Mutual labels:  parquet
Skale
High performance distributed data processing engine
Stars: ✭ 390 (+170.83%)
Mutual labels:  parquet
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (-2.78%)
Mutual labels:  parquet
Choetl
ETL Framework for .NET / C# (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
Stars: ✭ 372 (+158.33%)
Mutual labels:  parquet
Distributed
A distributed task scheduler for Dask
Stars: ✭ 1,168 (+711.11%)
Mutual labels:  pydata
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (+138.19%)
Mutual labels:  parquet
Blazingsql
BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
Stars: ✭ 1,652 (+1047.22%)
Mutual labels:  arrow
Arrows
Arrows is an animated custom view to give feedback about your UI sliding panels.
Stars: ✭ 338 (+134.72%)
Mutual labels:  arrow
Dask
Parallel computing with task scheduling
Stars: ✭ 9,309 (+6364.58%)
Mutual labels:  pydata
Datascience Anthology Pydata
PyData, The Complete Works of
Stars: ✭ 301 (+109.03%)
Mutual labels:  pydata
Parquet4s
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Stars: ✭ 125 (-13.19%)
Mutual labels:  parquet
Gcs Tools
GCS support for avro-tools, parquet-tools and protobuf
Stars: ✭ 57 (-60.42%)
Mutual labels:  parquet
Ratatool
A tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (+93.75%)
Mutual labels:  parquet
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+91.67%)
Mutual labels:  parquet
Pymapd
Python client for OmniSci GPU-accelerated SQL engine and analytics platform
Stars: ✭ 109 (-24.31%)
Mutual labels:  pydata
Node Parquet
NodeJS module to access apache parquet format files
Stars: ✭ 46 (-68.06%)
Mutual labels:  parquet
Sketch Connection Flow Arrows
Plugin for generating easy to use connection flow arrows in Sketch
Stars: ✭ 275 (+90.97%)
Mutual labels:  arrow
ml at awslambda pydatabln2018
Material for working alongside my workshop session at PyData Berlin 2018
Stars: ✭ 18 (-87.5%)
Mutual labels:  pydata
Arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
Stars: ✭ 8,828 (+6030.56%)
Mutual labels:  arrow
Stumpy
STUMPY is a powerful and scalable Python library for modern time series analysis
Stars: ✭ 2,019 (+1302.08%)
Mutual labels:  pydata
Scattertext Pydata
Notebooks for the Seattle PyData 2017 talk on Scattertext
Stars: ✭ 132 (-8.33%)
Mutual labels:  pydata