pulsephData Pulse application log aggregation and monitoring
Stars: ✭ 13 (-88.18%)
reglnWindows Registry Linking Utility
Stars: ✭ 38 (-65.45%)
DatabookA facebook for data
Stars: ✭ 26 (-76.36%)
MzingaOpen-source software to play the board game Hive.
Stars: ✭ 57 (-48.18%)
confluent-spark-avroSpark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-83.64%)
Docker HadoopApache Hadoop docker image
Stars: ✭ 1,190 (+981.82%)
avro-schema-generatorLibrary for generating Avro schema files (.avsc) based on DB table structure
Stars: ✭ 38 (-65.45%)
Stormtweetssentimentd3vizComputes and visualizes the sentiment of tweets from US states in real time using Storm.
Stars: ✭ 25 (-77.27%)
TiBigDataTiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+74.55%)
apiaryApiary provides modules which can be combined to create a federated cloud data lake
Stars: ✭ 30 (-72.73%)
radiatorHive Ruby API Client
Stars: ✭ 49 (-55.45%)
GuitarA Simple and Efficient Distributed Multidimensional BI Analysis Engine.
Stars: ✭ 86 (-21.82%)
TonYTonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 687 (+524.55%)
MobiusC# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+744.55%)
autThe Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (+0.91%)
Go Kafka AvroA library that provides a consumer/producer for working with Kafka, Avro, and Schema Registry
Stars: ✭ 39 (-64.55%)
SpidermanA general-purpose distributed crawler framework based on scrapy-redis
Stars: ✭ 392 (+256.36%)
avro turfA library that makes it easier to use the Avro serialization format from Ruby.
Stars: ✭ 130 (+18.18%)
terasliceScalable data processing pipelines in JavaScript
Stars: ✭ 48 (-56.36%)
dtailDTail is a distributed DevOps tool for tailing, grepping, catting logs and other text files on many remote machines at once.
Stars: ✭ 112 (+1.82%)
JavaFrameworkA simple Java framework designed for easily developing Spring-based Java programs. Supports big data and metadata management, a common Elasticsearch query tool, and so on.
Stars: ✭ 16 (-85.45%)
orionManagement and automation platform for Stateful Distributed Systems
Stars: ✭ 77 (-30%)
AvroConvertApache Avro serializer for .NET
Stars: ✭ 44 (-60%)
hadoop-ansibleInstall a Hadoop cluster with Ansible
Stars: ✭ 35 (-68.18%)
MagnolifyA collection of Magnolia add-on modules
Stars: ✭ 81 (-26.36%)
avro-serde-phpAvro Serialisation/Deserialisation (SerDe) library for PHP 7.3+ & 8.0 with a Symfony Serializer integration
Stars: ✭ 43 (-60.91%)
ambari-hdp-dockerDockerfiles and Docker Compose for HDP 2.6 with Blueprints
Stars: ✭ 23 (-79.09%)
dotnet-avroAn Avro implementation for .NET
Stars: ✭ 60 (-45.45%)
OrcApache ORC - the smallest, fastest columnar storage for Hadoop workloads
Stars: ✭ 389 (+253.64%)
UBAUEBA Solution for Insider Security. This repo is archived. Thanks!
Stars: ✭ 36 (-67.27%)
spark-acidACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-17.27%)
Hadoop For GeoeventArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-95.45%)
LuigiLuigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Stars: ✭ 15,226 (+13741.82%)
avro-typescriptTypeScript Code Generator for Apache Avro Schema Types
Stars: ✭ 19 (-82.73%)
MahaA framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.
Stars: ✭ 101 (-8.18%)
IgniteApache Ignite
Stars: ✭ 4,027 (+3560.91%)
implyrSQL backend to dplyr for Impala
Stars: ✭ 74 (-32.73%)
docker-hiveDocker image for Apache Hive Metastore
Stars: ✭ 42 (-61.82%)
Kafka Storm StarterCode examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+561.82%)
Awesome LearningHands-on source code repository: https://github.com/jast90/bigdata . Search for Jast on WeChat to follow the official account and get the latest technical posts 😯.
Stars: ✭ 197 (+79.09%)
yuzhouwanCode Library for My Blog
Stars: ✭ 39 (-64.55%)
hadoop-cryptoLibrary for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3.
Stars: ✭ 38 (-65.45%)
BigdlBuilding Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+3366.36%)
datasqueezeHadoop utility to compact small files
Stars: ✭ 18 (-83.64%)
hive-cubeSelf-service data export and monitoring platform based on the Hive data warehouse. https://hc.smartloli.org
Stars: ✭ 34 (-69.09%)
ChoetlETL Framework for .NET / C# (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
Stars: ✭ 372 (+238.18%)
waspWASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Stars: ✭ 19 (-82.73%)
oosoJava library for running Serverless MapReduce jobs
Stars: ✭ 25 (-77.27%)
prestoTeradata Distribution of Presto -- A Distributed SQL Query Engine for Big Data
Stars: ✭ 91 (-17.27%)