Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-40.21%)
Mutual labels: json, spark, avro, parquet
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+318.56%)
Mutual labels: json, spark, avro, parquet
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+305.15%)
Mutual labels: spark, avro, parquet
Choetl
ETL Framework for .NET / C# (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
Stars: ✭ 372 (+283.51%)
Mutual labels: json, avro, parquet
Vscode Data Preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (+152.58%)
Mutual labels: json, avro, parquet
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (+253.61%)
Mutual labels: spark, parquet
Ratatool
A tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (+187.63%)
Mutual labels: avro, parquet
Visidata
A terminal spreadsheet multitool for discovering and arranging data
Stars: ✭ 4,606 (+4648.45%)
Mutual labels: json, tsv
Pytablewriter
pytablewriter is a Python library to write a table in various formats: CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.
Stars: ✭ 422 (+335.05%)
Mutual labels: json, tsv
Sqlitebiter
A CLI tool to convert CSV / Excel / HTML / JSON / Jupyter Notebook / LDJSON / LTSV / Markdown / SQLite / SSV / TSV / Google-Sheets to a SQLite database file.
Stars: ✭ 601 (+519.59%)
Mutual labels: json, tsv
Structured Text Tools
A list of command line tools for manipulating structured text data
Stars: ✭ 6,180 (+6271.13%)
Mutual labels: json, tsv
Elasticsearch loader
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Stars: ✭ 300 (+209.28%)
Mutual labels: json, parquet
Sq
Swiss-army knife for data
Stars: ✭ 275 (+183.51%)
Mutual labels: json, tsv
Parquet Generator
Parquet file generator
Stars: ✭ 16 (-83.51%)
Mutual labels: spark, parquet
Kafka Storm Starter
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+650.52%)
Mutual labels: spark, avro
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-70.1%)
Mutual labels: spark, parquet
Sqawk
Like Awk but with SQL and table joins
Stars: ✭ 263 (+171.13%)
Mutual labels: json, tsv
confluent-spark-avro
Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-81.44%)
Mutual labels: spark, avro
qwery
A SQL-like language for performing ETL transformations.
Stars: ✭ 28 (-71.13%)
Mutual labels: tsv, avro
Pmacct
pmacct is a small set of multi-purpose passive network monitoring tools [NetFlow IPFIX sFlow libpcap BGP BMP RPKI IGP Streaming Telemetry].
Stars: ✭ 677 (+597.94%)
Mutual labels: json, avro