bandar-log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (-83.74%)
Kyuubi
Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (+195.12%)
beekeeper
Service for automatically managing and cleaning up unreferenced data
Stars: ✭ 43 (-65.04%)
UBA
UEBA Solution for Insider Security. This repo is archived. Thanks!
Stars: ✭ 36 (-70.73%)
Quix
Quix Notebook Manager
Stars: ✭ 184 (+49.59%)
hadoop-crypto
Library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3.
Stars: ✭ 38 (-69.11%)
data-profiling
A set of scripts to pull metadata and data profiling metrics from relational database systems
Stars: ✭ 57 (-53.66%)
learning-spark
Tidy up Spark and Hadoop tutorials.
Stars: ✭ 28 (-77.24%)
hiveql-parser
HiveQL Parser. Parses HiveQL code and prints the AST in JSON format on success; otherwise prints a well-formed syntax error message.
Stars: ✭ 25 (-79.67%)
Hive Third Functions
Some useful custom Hive UDFs, especially array, JSON, math, and string functions.
Stars: ✭ 151 (+22.76%)
disq
A library for manipulating bioinformatics sequencing formats in Apache Spark
Stars: ✭ 29 (-76.42%)
kubesql
A Presto-based tool that uses SQL to query Kubernetes resources such as pods and nodes.
Stars: ✭ 56 (-54.47%)
Hadoop Pot
A scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.
Stars: ✭ 8 (-93.5%)
Stormtweetssentimentd3viz
Computes and visualizes the sentiment of tweets from US states in real time using Storm.
Stars: ✭ 25 (-79.67%)
big-data-exploration
[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (-65.04%)
fense
Fense is a database proxy written in Java that can connect to databases of different engines at the same time. Key features include authority management, query cache, audit security, current limiting fuse, onesql, and so on
Stars: ✭ 22 (-82.11%)
Spark Authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (+14.63%)
Kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Stars: ✭ 916 (+644.72%)
qwery
A SQL-like language for performing ETL transformations.
Stars: ✭ 28 (-77.24%)
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-86.99%)
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-95.93%)
Winutils
winutils.exe, hadoop.dll, and hdfs.dll binaries for Hadoop on Windows
Stars: ✭ 657 (+434.15%)
openPDC
Open Source Phasor Data Concentrator
Stars: ✭ 109 (-11.38%)
last fm
A simple app demonstrating a testable, maintainable, and scalable architecture for Flutter. flutter_bloc, get_it, hive, and a REST API are among the technologies used in this project.
Stars: ✭ 134 (+8.94%)
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+75.61%)
regln
Windows Registry Linking Utility
Stars: ✭ 38 (-69.11%)
spark-acid
ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-26.02%)
Tony
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 626 (+408.94%)
Mzinga
Open-source software to play the board game Hive.
Stars: ✭ 57 (-53.66%)
docker-presto-adls-wasb
Example of a single node Presto with Azure Data Lake Store (ADLS) and Azure Storage Blob (WASB) access via Hive metastore
Stars: ✭ 16 (-86.99%)
Javapdf
🍣 100 Java e-books and technical books in PDF (take pride in downloading and reading; be ashamed of merely liking and bookmarking)
Stars: ✭ 609 (+395.12%)
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+4498.37%)
Awesome Hbase
A curated list of awesome HBase projects and resources.
Stars: ✭ 140 (+13.82%)
Dist Keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Stars: ✭ 613 (+398.37%)
Hadoop Attack Library
A collection of pentest tools and resources targeting Hadoop environments
Stars: ✭ 228 (+85.37%)
Myblog
An in-depth Java technical blog
Stars: ✭ 1,251 (+917.07%)
Beezig
🐝 Beezig - The Hive plugin for 5zig.
Stars: ✭ 16 (-86.99%)
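Hadoop study
Regularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus areas, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code. Continuously updated!)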
Stars: ✭ 567 (+360.98%)
Cloud Note
A distributed cloud note-taking application (modeled on a certain "dao" cloud notes product), with data stored in redis and hbase
Stars: ✭ 71 (-42.28%)
Hadoop Connectors
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
Stars: ✭ 218 (+77.24%)
Heracles
High performance HBase / Spark SQL engine
Stars: ✭ 27 (-78.05%)
Gis Tools For Hadoop
The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.
Stars: ✭ 485 (+294.31%)
Calcite
Apache Calcite
Stars: ✭ 2,816 (+2189.43%)
HiveRunner
An Open Source unit test framework for Hive queries based on JUnit 4 and 5
Stars: ✭ 244 (+98.37%)
radiator
Hive Ruby API Client
Stars: ✭ 49 (-60.16%)
flink-client
Java library for managing Apache Flink via the Monitoring REST API
Stars: ✭ 48 (-60.98%)
beemos
BEE MOnitoring System: create an infrastructure for monitoring beehives
Stars: ✭ 16 (-86.99%)
presto-client-php
A Presto client for the PHP programming language.
Stars: ✭ 24 (-80.49%)
docker-hive
Docker image for Apache Hive Metastore
Stars: ✭ 42 (-65.85%)