550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...

Stars: ✭ 226 (+148.35%)

Mutual labels: hadoop

hive to es

A small tool for syncing Hive data warehouse data to Elasticsearch

Stars: ✭ 21 (-76.92%)

Mutual labels: hadoop

bigdata-doc

Big data study notes, learning roadmap, and a collection of technical case studies.

Stars: ✭ 37 (-59.34%)

Mutual labels: hadoop

Javaorbigdata Interview

A compilation of interview topics for Java developers and big data developers

Stars: ✭ 203 (+123.08%)

Mutual labels: hadoop

disk

A distributed network disk system built on Hadoop, HBase, and Spring Boot

Stars: ✭ 53 (-41.76%)

Mutual labels: hadoop

TonY

TonY is a framework to natively run deep learning frameworks on Apache Hadoop.

Stars: ✭ 687 (+654.95%)

Mutual labels: hadoop

oci-cloudera

Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)

Stars: ✭ 20 (-78.02%)

Mutual labels: hadoop

JavaFramework

A simple Java framework designed for easily developing Spring-based Java programs. Supports big data and metadata management, a common Elasticsearch query tool, and so on.

Stars: ✭ 16 (-82.42%)

Mutual labels: hadoop

qs-hadoop

Learning the big data ecosystem

Stars: ✭ 18 (-80.22%)

Mutual labels: hadoop

RecommendationEngine

Source code and dataset for paper "CBMR: An optimized MapReduce for item‐based collaborative filtering recommendation algorithm with empirical analysis"

Stars: ✭ 43 (-52.75%)

Mutual labels: hadoop

sparkucx

A high-performance, scalable, and efficient ShuffleManager plugin for Apache Spark, utilizing the UCX communication layer

Stars: ✭ 32 (-64.84%)

Mutual labels: hadoop

bigdatatutorial

Stars: ✭ 34 (-62.64%)

Mutual labels: hadoop

hive-bigquery-storage-handler

Hive Storage Handler for interoperability between BigQuery and Apache Hive

Stars: ✭ 16 (-82.42%)

Mutual labels: hadoop

Luigi

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

Stars: ✭ 15,226 (+16631.87%)

Mutual labels: hadoop

disq

A library for manipulating bioinformatics sequencing formats in Apache Spark

Stars: ✭ 29 (-68.13%)

Mutual labels: hadoop

Facebook Hive Udfs

Facebook's Hive UDFs

Stars: ✭ 213 (+134.07%)

Mutual labels: hadoop

DBTestCompare

Application to compare results of two SQL queries

Stars: ✭ 15 (-83.52%)

Mutual labels: teradata

learning-hadoop-and-spark

Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning

Stars: ✭ 146 (+60.44%)

Mutual labels: hadoop

Recommendsys

Recommendation project (real-time and offline recommendation)

Stars: ✭ 198 (+117.58%)

Mutual labels: hadoop

pyspark-ML-in-Colab

Pyspark in Google Colab: A simple machine learning (Linear Regression) model

Stars: ✭ 32 (-64.84%)

Mutual labels: hadoop

openPDC

Open Source Phasor Data Concentrator

Stars: ✭ 109 (+19.78%)

Mutual labels: hadoop

learning-spark

Tidy up Spark and Hadoop tutorials.

Stars: ✭ 28 (-69.23%)

Mutual labels: hadoop

dpkb

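A collection of big data material, covering distributed storage engines, distributed compute engines, data warehouse construction, and more. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse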

Stars: ✭ 123 (+35.16%)

Mutual labels: hadoop

big-data-exploration

[Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product

Stars: ✭ 43 (-52.75%)

Mutual labels: hadoop

yarn-prometheus-exporter

Export Hadoop YARN (ResourceManager) metrics in Prometheus format

Stars: ✭ 44 (-51.65%)

Mutual labels: hadoop

memex-gate

General Architecture for Text Engineering

Stars: ✭ 47 (-48.35%)

Mutual labels: hadoop

teraslice

Scalable data processing pipelines in JavaScript

Stars: ✭ 48 (-47.25%)

Mutual labels: hadoop

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations

Stars: ✭ 39 (-57.14%)

Mutual labels: hadoop

beanszoo

Distributed Java micro-services using ZooKeeper

Stars: ✭ 12 (-86.81%)

Mutual labels: hadoop

jmx exporter-cloudera-hadoop

Prometheus jmx_exporter configurations for Cloudera Hadoop

Stars: ✭ 33 (-63.74%)

Mutual labels: hadoop

hadoop-ansible

Install a Hadoop cluster with Ansible

Stars: ✭ 35 (-61.54%)

Mutual labels: hadoop

dockerfiles

Multiple Docker container images for the main big data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ...)

Stars: ✭ 29 (-68.13%)

Mutual labels: hadoop

ambari-hdp-docker

Dockerfiles and Docker Compose for HDP 2.6 with Blueprints

Stars: ✭ 23 (-74.73%)

Mutual labels: hadoop

liquibase-impala

Liquibase extension to add Impala Database support

Stars: ✭ 23 (-74.73%)

Mutual labels: hadoop

phoenix

Apache Phoenix / HBase Spring Boot Microservices