131 open source projects that are alternatives to, or similar to, Kartothek

Create full-fledged APIs for static datasets without writing a single line of code.

Stars: ✭ 253 (+75.69%)

Mutual labels: parquet, arrow

Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files

Stars: ✭ 245 (+70.14%)

Mutual labels: parquet, arrow

Awkward 0.x

Manipulate arrays of complex data structures as easily as Numpy.

Stars: ✭ 216 (+50%)

Mutual labels: parquet, arrow

graphique

GraphQL service for arrow tables and parquet data sets.

Stars: ✭ 28 (-80.56%)

Mutual labels: arrow, parquet

Arrow.jl

Pure Julia implementation of the Apache Arrow data format (https://arrow.apache.org/)

Stars: ✭ 92 (-36.11%)

Mutual labels: arrow

Android Expandicon

Nice and simple customizable implementation of Google style up/down expand arrow.

Stars: ✭ 871 (+504.86%)

Mutual labels: arrow

Parquet Format

Apache Parquet

Stars: ✭ 800 (+455.56%)

Mutual labels: parquet

Arrow

Λrrow - Functional companion to Kotlin's Standard Library

Stars: ✭ 4,771 (+3213.19%)

Mutual labels: arrow

Leader Line

Draw a leader line in your web page.

Stars: ✭ 1,872 (+1200%)

Mutual labels: arrow

Sparksql Protobuf

Read SparkSQL parquet file as RDD[Protobuf]

Stars: ✭ 82 (-43.06%)

Mutual labels: parquet

Cudf

cuDF - GPU DataFrame Library

Stars: ✭ 4,370 (+2934.72%)

Mutual labels: arrow

Pucket

Bucketing and partitioning system for Parquet

Stars: ✭ 29 (-79.86%)

Mutual labels: parquet

Schemer

Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.

Stars: ✭ 97 (-32.64%)

Mutual labels: parquet

Array Api

RFC document, tooling and other content related to the array API standard

Stars: ✭ 26 (-81.94%)

Mutual labels: pydata

Amazon S3 Find And Forget

Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)

Stars: ✭ 115 (-20.14%)

Mutual labels: parquet

Pyjanitor

Clean APIs for data cleaning. Python implementation of R package Janitor

Stars: ✭ 647 (+349.31%)

Mutual labels: pydata

Parquet Mr

Apache Parquet

Stars: ✭ 1,278 (+787.5%)

Mutual labels: parquet

Iceberg

Iceberg is a table format for large, slow-moving tabular data

Stars: ✭ 393 (+172.92%)

Mutual labels: parquet

Pydata Chicago2016 Ml Tutorial

Machine learning with scikit-learn tutorial at PyData Chicago 2016

Stars: ✭ 128 (-11.11%)

Mutual labels: pydata

Petastorm

Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.

Stars: ✭ 1,108 (+669.44%)

Mutual labels: parquet

Arrow

🏹 Parse JSON with style

Stars: ✭ 355 (+146.53%)

Mutual labels: arrow

Parquet Cpp

Apache Parquet

Stars: ✭ 339 (+135.42%)

Mutual labels: parquet

Pystore

Fast data store for Pandas time-series data

Stars: ✭ 325 (+125.69%)

Mutual labels: parquet

Parquet Index

Spark SQL index for Parquet tables

Stars: ✭ 109 (-24.31%)

Mutual labels: parquet

Rumble

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-59.72%)

Mutual labels: parquet

Elasticsearch loader

A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch

Stars: ✭ 300 (+108.33%)

Mutual labels: parquet

Pre Short Closures

Stars: ✭ 36 (-75%)

Mutual labels: arrow

Kglab

Graph-Based Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, RDFlib, pySHACL, RAPIDS, NetworkX, iGraph, PyVis, pslpython, pyarrow, etc.

Stars: ✭ 98 (-31.94%)

Mutual labels: parquet

Pydata.kr

Official PyData Korea website. (In preparation)

Stars: ✭ 13 (-90.97%)

Mutual labels: pydata

Drill

Apache Drill is a distributed MPP query layer for self describing data

Stars: ✭ 1,619 (+1024.31%)

Mutual labels: parquet

Neural Image Captioning

Implementation of Neural Image Captioning model using Keras with Theano backend

Stars: ✭ 12 (-91.67%)

Mutual labels: pydata

Pyvtreat

vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.

Stars: ✭ 92 (-36.11%)

Mutual labels: pydata

Parquet Generator

Parquet file generator

Stars: ✭ 16 (-88.89%)

Mutual labels: parquet

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Stars: ✭ 1,642 (+1040.28%)

Mutual labels: parquet

React Archer

🏹 Draw arrows between React elements 🖋

Stars: ✭ 666 (+362.5%)

Mutual labels: arrow

Open Arrow

Open Arrow is an open-source font that contains 112 arrow symbols from U+2190 to U+21ff

Stars: ✭ 89 (-38.19%)

Mutual labels: arrow

Datafusion

DataFusion has now been donated to the Apache Arrow project

Stars: ✭ 611 (+324.31%)

Mutual labels: arrow

Parquet Go

Go package to read and write Parquet files. Parquet is a file format that stores nested data structures in a flat columnar layout. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.

Stars: ✭ 114 (-20.83%)

Mutual labels: parquet

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+181.94%)

Mutual labels: parquet

Bigdata File Viewer

A cross-platform (Windows, macOS, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.

Stars: ✭ 86 (-40.28%)

Mutual labels: parquet

Skale

High performance distributed data processing engine

Stars: ✭ 390 (+170.83%)

Mutual labels: parquet

Eel Sdk

Big Data Toolkit for the JVM

Stars: ✭ 140 (-2.78%)

Mutual labels: parquet

Choetl

ETL Framework for .NET / C# (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)