1760 open source projects that are alternatives to or similar to Schemer

Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-40.21%)
Mutual labels:  json, spark, avro, parquet
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+318.56%)
Mutual labels:  json, spark, avro, parquet
Vscode Data Preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (+152.58%)
Mutual labels:  json, avro, parquet
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+305.15%)
Mutual labels:  spark, avro, parquet
Choetl
ETL Framework for .NET / C# (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
Stars: ✭ 372 (+283.51%)
Mutual labels:  json, avro, parquet
qwery
A SQL-like language for performing ETL transformations.
Stars: ✭ 28 (-71.13%)
Mutual labels:  tsv, avro
Structured Text Tools
A list of command line tools for manipulating structured text data
Stars: ✭ 6,180 (+6271.13%)
Mutual labels:  json, tsv
Sqlitebiter
A CLI tool to convert CSV / Excel / HTML / JSON / Jupyter Notebook / LDJSON / LTSV / Markdown / SQLite / SSV / TSV / Google-Sheets to a SQLite database file.
Stars: ✭ 601 (+519.59%)
Mutual labels:  json, tsv
Rq
Record Query - A tool for doing record analysis and transformation
Stars: ✭ 1,808 (+1763.92%)
Mutual labels:  json, avro
Kafka Connect Mongodb
**Unofficial / Community** Kafka Connect MongoDB Sink Connector - Find the official MongoDB Kafka Connector here: https://www.mongodb.com/kafka-connector
Stars: ✭ 137 (+41.24%)
Mutual labels:  json, avro
Miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Stars: ✭ 4,633 (+4676.29%)
Mutual labels:  json, tsv
qsv
CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+351.55%)
Mutual labels:  tsv, parquet
Parquet Generator
Parquet file generator
Stars: ✭ 16 (-83.51%)
Mutual labels:  spark, parquet
experiments
Code examples for my blog posts
Stars: ✭ 21 (-78.35%)
Mutual labels:  spark, parquet
parquet-extra
A collection of Apache Parquet add-on modules
Stars: ✭ 30 (-69.07%)
Mutual labels:  avro, parquet
columnify
Convert record-oriented data to columnar format.
Stars: ✭ 28 (-71.13%)
Mutual labels:  avro, parquet
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+1592.78%)
Mutual labels:  spark, parquet
Abris
Avro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (+34.02%)
Mutual labels:  spark, avro
Pxi
🧚 pxi (pixie) is a small, fast, and magical command-line data processor similar to jq, mlr, and awk.
Stars: ✭ 248 (+155.67%)
Mutual labels:  json, tsv
Schema Registry
Confluent Schema Registry for Kafka
Stars: ✭ 1,647 (+1597.94%)
Mutual labels:  json, avro
kafka-compose
🎼 Docker compose files for various kafka stacks
Stars: ✭ 32 (-67.01%)
Mutual labels:  spark, avro
confluent-spark-avro
Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-81.44%)
Mutual labels:  spark, avro
Sqawk
Like Awk but with SQL and table joins
Stars: ✭ 263 (+171.13%)
Mutual labels:  json, tsv
Ratatool
A tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (+187.63%)
Mutual labels:  avro, parquet
Sq
Swiss-army knife for data
Stars: ✭ 275 (+183.51%)
Mutual labels:  json, tsv
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (+253.61%)
Mutual labels:  spark, parquet
Noproto
Flexible, Fast & Compact Serialization with RPC
Stars: ✭ 138 (+42.27%)
Mutual labels:  json, avro
Pmacct
pmacct is a small set of multi-purpose passive network monitoring tools [NetFlow IPFIX sFlow libpcap BGP BMP RPKI IGP Streaming Telemetry].
Stars: ✭ 677 (+597.94%)
Mutual labels:  json, avro
DaFlow
Apache Spark-based Data Flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-75.26%)
Mutual labels:  avro, parquet
Storagetapper
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (+139.18%)
Mutual labels:  json, avro
Elasticsearch loader
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Stars: ✭ 300 (+209.28%)
Mutual labels:  json, parquet
Kafka Storm Starter
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+650.52%)
Mutual labels:  spark, avro
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-70.10%)
Mutual labels:  spark, parquet
Pytablewriter
pytablewriter is a Python library to write a table in various formats: CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.
Stars: ✭ 422 (+335.05%)
Mutual labels:  json, tsv
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (+12.37%)
Mutual labels:  spark, parquet
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+82.47%)
Mutual labels:  avro, parquet
parquet-flinktacular
How to use Parquet in Flink
Stars: ✭ 29 (-70.10%)
Mutual labels:  avro, parquet
Visidata
A terminal spreadsheet multitool for discovering and arranging data
Stars: ✭ 4,606 (+4648.45%)
Mutual labels:  json, tsv
Gcs Tools
GCS support for avro-tools, parquet-tools and protobuf
Stars: ✭ 57 (-41.24%)
Mutual labels:  avro, parquet
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats like Parquet, ORC, Avro, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (-11.34%)
Mutual labels:  avro, parquet
Jsonmapper
Map nested JSON structures onto PHP classes
Stars: ✭ 1,306 (+1246.39%)
Mutual labels:  json
Repository
A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-5.15%)
Mutual labels:  spark
Tabtoy
High-performance spreadsheet data exporter
Stars: ✭ 1,302 (+1242.27%)
Mutual labels:  json
Rust graphql api boilerplate
A Boilerplate of GraphQL API built in Rust + Warp + Juniper + Diesel
Stars: ✭ 91 (-6.19%)
Mutual labels:  graphql-api
Play Circe
circe for play
Stars: ✭ 95 (-2.06%)
Mutual labels:  json
Json Benchmark
nativejson-benchmark in Rust
Stars: ✭ 93 (-4.12%)
Mutual labels:  json
Catj
Displays JSON files in a flat format.
Stars: ✭ 1,301 (+1241.24%)
Mutual labels:  json
Tanka
Flexible, reusable and concise configuration for Kubernetes
Stars: ✭ 1,299 (+1239.18%)
Mutual labels:  json
Night Config
Powerful java configuration library for toml, yaml, hocon, json and in-memory configurations
Stars: ✭ 93 (-4.12%)
Mutual labels:  json
Kson
Gson TypeAdapter & Factory generator for Kotlin data classes
Stars: ✭ 90 (-7.22%)
Mutual labels:  json
Simdjson php
simdjson_php bindings for the simdjson project. https://github.com/lemire/simdjson
Stars: ✭ 90 (-7.22%)
Mutual labels:  json
Relation extraction
Relation Extraction using Deep Learning (CNN)
Stars: ✭ 96 (-1.03%)
Mutual labels:  spark
Generic Json Swift
A simple Swift library for working with generic JSON structures
Stars: ✭ 95 (-2.06%)
Mutual labels:  json
Spark Summit 2017 Sanfrancisco
spark summit 2017 SanFrancisco
Stars: ✭ 93 (-4.12%)
Mutual labels:  spark
Importjsonapi
Use JSONPath to selectively extract data from any JSON or GraphQL API directly into Google Sheets.
Stars: ✭ 90 (-7.22%)
Mutual labels:  graphql-api
Summitdb
In-memory NoSQL database with ACID transactions, Raft consensus, and Redis API
Stars: ✭ 1,295 (+1235.05%)
Mutual labels:  json
Big Data
🔧 Use dplyr to analyze Big Data 🐘
Stars: ✭ 93 (-4.12%)
Mutual labels:  spark
Bitsofbytes
Code and projects from my blog posts.
Stars: ✭ 89 (-8.25%)
Mutual labels:  json
Redisjson Py
An extension to redis-py for using Redis' ReJSON module
Stars: ✭ 89 (-8.25%)
Mutual labels:  json
Jsonmasking
Replace fields in JSON with another value, regardless of how deeply the property is nested. Very useful for masking passwords, credit card numbers, etc.
Stars: ✭ 95 (-2.06%)
Mutual labels:  json