230 open source projects that are alternatives to or similar to MLHadoop

hive to es

A small tool for syncing Hive data warehouse data to Elasticsearch

Stars: ✭ 21 (-58%)

Mutual labels: hadoop

TonY

TonY is a framework to natively run deep learning frameworks on Apache Hadoop.

Stars: ✭ 687 (+1274%)

Mutual labels: hadoop

skein

A tool and library for easily deploying applications on Apache YARN

Stars: ✭ 128 (+156%)

Mutual labels: hadoop

Data-pipeline-project

Data pipeline project

Stars: ✭ 18 (-64%)

Mutual labels: hadoop

RecommendationEngine

Source code and dataset for paper "CBMR: An optimized MapReduce for item‐based collaborative filtering recommendation algorithm with empirical analysis"

Stars: ✭ 43 (-14%)

Mutual labels: hadoop

rastercube

rastercube is a Python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)

Stars: ✭ 15 (-70%)

Mutual labels: hadoop

bigdata-doc

A collection of big data study notes, learning roadmaps, and technical case studies.

Stars: ✭ 37 (-26%)

Mutual labels: hadoop

wasp

WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.

Stars: ✭ 19 (-62%)

Mutual labels: hadoop

JavaFramework

A simple Java framework designed for easily developing Spring-based Java programs. Supports big data and metadata management, and includes a common Elasticsearch query tool, among other things.

Stars: ✭ 16 (-68%)

Mutual labels: hadoop

BigInsights-on-Apache-Hadoop

Example projects for 'BigInsights for Apache Hadoop' on IBM Bluemix

Stars: ✭ 21 (-58%)

Mutual labels: hadoop

qs-hadoop

Learning the big data ecosystem

Stars: ✭ 18 (-64%)

Mutual labels: hadoop

bigdatatutorial

Stars: ✭ 34 (-32%)

Mutual labels: hadoop

sparkucx

A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing the UCX communication layer

Stars: ✭ 32 (-36%)

Mutual labels: hadoop

hive-bigquery-storage-handler

Hive Storage Handler for interoperability between BigQuery and Apache Hive

Stars: ✭ 16 (-68%)

Mutual labels: hadoop

hadoop-crypto

Library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3.

Stars: ✭ 38 (-24%)

Mutual labels: hadoop

smart-data-lake

Smart Automation Tool for building modern Data Lakes and Data Pipelines

Stars: ✭ 79 (+58%)

Mutual labels: hadoop

oci-cloudera

Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)

Stars: ✭ 20 (-60%)

Mutual labels: hadoop

webhdfs

Node.js WebHDFS REST API client

Stars: ✭ 88 (+76%)

Mutual labels: hadoop

darwin

Avro Schema Evolution made easy

Stars: ✭ 26 (-48%)

Mutual labels: hadoop

gomrjob

gomrjob - a Go Framework for Hadoop Map Reduce Jobs

Stars: ✭ 39 (-22%)

Mutual labels: hadoop

disq

A library for manipulating bioinformatics sequencing formats in Apache Spark

Stars: ✭ 29 (-42%)

Mutual labels: hadoop

orion

Management and automation platform for Stateful Distributed Systems

Stars: ✭ 77 (+54%)

Mutual labels: hadoop

hadoop-ecosystem

Visualizations of the Hadoop Ecosystem

Stars: ✭ 20 (-60%)

Mutual labels: hadoop

kafka-connect-fs

Kafka Connect FileSystem Connector

Stars: ✭ 107 (+114%)

Mutual labels: hadoop

disk

A distributed network disk system built on Hadoop, HBase, and Spring Boot

Stars: ✭ 53 (+6%)

Mutual labels: hadoop

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Stars: ✭ 39 (-22%)

Mutual labels: hadoop

docker-hadoop

Docker image for the main Apache Hadoop components (YARN/HDFS)

Stars: ✭ 59 (+18%)

Mutual labels: hadoop

memex-gate

General Architecture for Text Engineering

Stars: ✭ 47 (-6%)

Mutual labels: hadoop

dockerfiles

Multiple Docker container images for the main big data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ...)

Stars: ✭ 29 (-42%)

Mutual labels: hadoop

implyr

SQL backend to dplyr for Impala

Stars: ✭ 74 (+48%)

Mutual labels: hadoop

the-apache-ignite-book

All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above.

Stars: ✭ 65 (+30%)

Mutual labels: hadoop

hadoopoffice

HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)

Stars: ✭ 56 (+12%)

Mutual labels: hadoop

iis

Information Inference Service of the OpenAIRE system

Stars: ✭ 16 (-68%)

Mutual labels: hadoop

aaocp

A big data project for analyzing user behavior logs

Stars: ✭ 53 (+6%)

Mutual labels: hadoop

HDFS-Netdisc

A Hadoop-based distributed cloud storage system 🌴

Stars: ✭ 56 (+12%)

Mutual labels: hadoop

learning-spark

Tidied-up Spark and Hadoop tutorials.

Stars: ✭ 28 (-44%)

Mutual labels: hadoop

learning-hadoop-and-spark

Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning

Stars: ✭ 146 (+192%)

Mutual labels: hadoop

datasqueeze

Hadoop utility to compact small files

Stars: ✭ 18 (-64%)

Mutual labels: hadoop

openPDC

Open Source Phasor Data Concentrator

Stars: ✭ 109 (+118%)

Mutual labels: hadoop

jmx exporter-cloudera-hadoop

Prometheus jmx_exporter configurations for Cloudera Hadoop

Stars: ✭ 33 (-34%)

Mutual labels: hadoop

dpkb

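A compilation of big data topics, including distributed storage engines, distributed compute engines, and data warehouse construction. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse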

Stars: ✭ 123 (+146%)

Mutual labels: hadoop

DaFlow

Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.

Stars: ✭ 24 (-52%)

Mutual labels: hadoop

yarn-prometheus-exporter

Export Hadoop YARN (resource-manager) metrics in prometheus format

Stars: ✭ 44 (-12%)

Mutual labels: hadoop

xxhadoop

Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes, code, and demos. Don't fork, just star!

Stars: ✭ 37 (-26%)

Mutual labels: hadoop

teraslice

Scalable data processing pipelines in JavaScript