gomrjob: a Go framework for Hadoop MapReduce jobs
Stars: ✭ 39 (-57.14%)
webhdfs: Node.js WebHDFS REST API client
Stars: ✭ 88 (-3.3%)
Sparkrdma: RDMA-accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+136.26%)
LogAnalyzeHelper: Data-cleaning program for a forum log analysis system (includes an IP rule library, UDF development, MapReduce programs, and log data)
Stars: ✭ 33 (-63.74%)
orion: Management and automation platform for Stateful Distributed Systems
Stars: ✭ 77 (-15.38%)
skein: A tool and library for easily deploying applications on Apache YARN
Stars: ✭ 128 (+40.66%)
Devops Bash Tools: 550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...
Stars: ✭ 226 (+148.35%)
hive to es: A small tool for syncing Hive data warehouse data to Elasticsearch
Stars: ✭ 21 (-76.92%)
disk: A distributed network disk (file storage) system built on Hadoop, HBase, and Spring Boot
Stars: ✭ 53 (-41.76%)
TonY: TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 687 (+654.95%)
oci-cloudera: Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)
Stars: ✭ 20 (-78.02%)
JavaFramework: Simple Java framework designed for easily developing Spring-based Java programs. Supports big data and metadata management, a common Elasticsearch query tool, and so on.
Stars: ✭ 16 (-82.42%)
RecommendationEngine: Source code and dataset for the paper "CBMR: An optimized MapReduce for item‐based collaborative filtering recommendation algorithm with empirical analysis"
Stars: ✭ 43 (-52.75%)
sparkucx: A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing the UCX communication layer
Stars: ✭ 32 (-64.84%)
Luigi: Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.
Stars: ✭ 15,226 (+16631.87%)
disq: A library for manipulating bioinformatics sequencing formats in Apache Spark
Stars: ✭ 29 (-68.13%)
DBTestCompare: Application to compare results of two SQL queries
Stars: ✭ 15 (-83.52%)
learning-hadoop-and-spark: Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+60.44%)
pyspark-ML-in-Colab: PySpark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (-64.84%)
openPDC: Open Source Phasor Data Concentrator
Stars: ✭ 109 (+19.78%)
learning-spark: Tidied-up Spark and Hadoop tutorials.
Stars: ✭ 28 (-69.23%)
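dpkb: A collection of big data material, including distributed storage engines, distributed compute engines, data warehouse construction, and more. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse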
Stars: ✭ 123 (+35.16%)
big-data-exploration: [Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (-52.75%)
memex-gate: General Architecture for Text Engineering
Stars: ✭ 47 (-48.35%)
teraslice: Scalable data processing pipelines in JavaScript
Stars: ✭ 48 (-47.25%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-57.14%)
beanszoo: Distributed Java micro-services using ZooKeeper
Stars: ✭ 12 (-86.81%)
hadoop-ansible: Install a Hadoop cluster with Ansible
Stars: ✭ 35 (-61.54%)
dockerfiles: Multiple Docker container images for the main Big Data tools. (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (-68.13%)
ambari-hdp-docker: Dockerfiles and Docker Compose for HDP 2.6 with Blueprints
Stars: ✭ 23 (-74.73%)
liquibase-impala: Liquibase extension to add Impala Database support
Stars: ✭ 23 (-74.73%)
phoenix: Apache Phoenix / HBase Spring Boot Microservices
Stars: ✭ 23 (-74.73%)
the-apache-ignite-book: All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above
Stars: ✭ 65 (-28.57%)
docker-hadoop: Docker image for the main Apache Hadoop components (YARN/HDFS)
Stars: ✭ 59 (-35.16%)
xxhadoop: Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes, code, and demos. Don't fork, just star!
Stars: ✭ 37 (-59.34%)
Hadoop Attack Library: A collection of pentest tools and resources targeting Hadoop environments
Stars: ✭ 228 (+150.55%)
iis: Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-82.42%)
Hadoop Connectors: Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
Stars: ✭ 218 (+139.56%)
hadoopoffice: HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-38.46%)
Calcite: Apache Calcite
Stars: ✭ 2,816 (+2994.51%)
Shifu: An end-to-end machine learning and data mining framework on Hadoop
Stars: ✭ 207 (+127.47%)
corc: An ORC File Scheme for the Cascading data processing platform.
Stars: ✭ 14 (-84.62%)
hadoop-etl-udfs: The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-81.32%)
rastercube: rastercube is a Python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)
Stars: ✭ 15 (-83.52%)
smart-data-lake: Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (-13.19%)