Bigdata File Viewer
A cross-platform (Windows, macOS, Linux) desktop application to view common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (+207.14%)
Gcs Tools
GCS support for avro-tools, parquet-tools and protobuf
Stars: ✭ 57 (+103.57%)
Choetl
ETL framework for .NET / C# (parser / writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
Stars: ✭ 372 (+1228.57%)
parquet-extra
A collection of Apache Parquet add-on modules
Stars: ✭ 30 (+7.14%)
Avro
Apache Avro is a data serialization system.
Stars: ✭ 2,005 (+7060.71%)
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+532.14%)
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+1350%)
Vscode Data Preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (+775%)
Ratatool
A tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (+896.43%)
centurion
Kotlin Bigdata Toolkit
Stars: ✭ 320 (+1042.86%)
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+1303.57%)
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (+107.14%)
DaFlow
Apache Spark based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-14.29%)
Schemer
Schema registry for CSV, TSV, JSON, Avro and Parquet schemas. Supports schema inference and a GraphQL API.
Stars: ✭ 97 (+246.43%)
bigquery-data-lineage
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Stars: ✭ 112 (+300%)
srclient
Golang Client for Schema Registry
Stars: ✭ 188 (+571.43%)
qsv
CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+1464.29%)
anovos
Anovos - An open source library for scalable feature engineering using Apache Spark
Stars: ✭ 77 (+175%)
dt-sql-parser
SQL Parsers for BigData, built with antlr4.
Stars: ✭ 135 (+382.14%)
sbt-avro
SBT plugin to generate Scala classes from Apache Avro schemas hosted on a remote Confluent Schema Registry.
Stars: ✭ 15 (-46.43%)
avro-serde-php
Avro Serialisation/Deserialisation (SerDe) library for PHP 7.3+ & 8.0 with a Symfony Serializer integration
Stars: ✭ 43 (+53.57%)
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+4725%)
PersonNotes
A hub for personal notes, recording technical notes in a quick-and-dirty style .. 📚☕️⌨️🎧
Stars: ✭ 61 (+117.86%)
amas
Amas is a recursive acronym for “Amas, monitor alert system”.
Stars: ✭ 77 (+175%)
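intersect
Thoughts on an interview question - finding the intersection of 60 million data packets and 3 million data packets within a 50M memory limit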
Stars: ✭ 54 (+92.86%)
miniparquet
Library to read a subset of Parquet files
Stars: ✭ 38 (+35.71%)
albis
Albis: High-Performance File Format for Big Data Systems
Stars: ✭ 20 (-28.57%)
kafka-scala-examples
Examples of Avro, Kafka, Schema Registry, Kafka Streams, Interactive Queries, KSQL, Kafka Connect in Scala
Stars: ✭ 53 (+89.29%)
hayabusa
Hayabusa: Simple and Fast Full-Text Search Engine for Massive System Log Data
Stars: ✭ 43 (+53.57%)
avrow
Avrow is a pure Rust implementation of the Avro specification https://avro.apache.org/docs/current/spec.html with Serde support.
Stars: ✭ 27 (-3.57%)
tamer
Standalone alternatives to Kafka Connect Connectors
Stars: ✭ 42 (+50%)
dockerfiles
Multiple Docker container images for the main big data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (+3.57%)
workflUX
An open-source, cloud-ready web application for simplified deployment of big data workflows.
Stars: ✭ 26 (-7.14%)
openmrs-fhir-analytics
A collection of tools for extracting FHIR resources and analytics services on top of that data.
Stars: ✭ 55 (+96.43%)
greycat
GreyCat - Data Analytics, Temporal data, What-if, Live machine learning
Stars: ✭ 104 (+271.43%)
codefoundry
Examples for gauravbytes.com
Stars: ✭ 57 (+103.57%)
Notes
This is a learning note | Java basics, JVM, source code, big data, interview experiences
Stars: ✭ 69 (+146.43%)
schema-registry-php-client
A PHP 7.3+ API client for the Confluent Schema Registry REST API based on Guzzle 6 - http://docs.confluent.io/current/schema-registry/docs/index.html
Stars: ✭ 40 (+42.86%)
lectures-hse-spark
Scalable machine learning and big data analysis with Apache Spark
Stars: ✭ 20 (-28.57%)
StreamBench
Measuring the performance of popular streaming engines with Yahoo's Streaming Benchmark
Stars: ✭ 52 (+85.71%)
fast-avro-write
Writing an Avro file is not as fast as you might want it to be. This is a library to write considerably faster to an Avro file.
Stars: ✭ 32 (+14.29%)
Storagetapper
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (+728.57%)
Jackson Dataformats Binary
Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Stars: ✭ 221 (+689.29%)
Mu Haskell
Mu (μ) is a purely functional framework for building micro services.
Stars: ✭ 215 (+667.86%)
singlestore-logistics-sim
Scalable package delivery logistics simulator built using SingleStore and Vectorized Redpanda
Stars: ✭ 31 (+10.71%)
the-apache-ignite-book
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Include Apache Ignite 2.6 or above
Stars: ✭ 65 (+132.14%)
TiBigData
TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+585.71%)
Kafkactl
Command Line Tool for managing Apache Kafka
Stars: ✭ 177 (+532.14%)
Gradle Avro Plugin
A Gradle plugin that makes it easy to generate Java code for Apache Avro. It supports JSON schema declaration files, JSON protocol declaration files, and Avro IDL files.
Stars: ✭ 176 (+528.57%)
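awesome-coder-resources
A refueling station on the programming road! [Continuously updated... stars welcome, come back often...] [Contents: programming/learning/reading resources, open source projects, interview questions, websites, books, blogs, tutorials, and more]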
Stars: ✭ 54 (+92.86%)