Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+45833.33%)

Mutual labels: mapreduce

Guitar

A Simple and Efficient Distributed Multidimensional BI Analysis Engine.

Stars: ✭ 86 (+79.17%)

Mutual labels: mapreduce

Rafty

Implementation of RAFT consensus in .NET core

Stars: ✭ 182 (+279.17%)

Mutual labels: raft-consensus-algorithm

Bigdata Interview

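🎯 🌟 [Big data interview questions] A collection of big-data interview questions gathered from around the web, together with my own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.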

Stars: ✭ 857 (+1685.42%)

Mutual labels: mapreduce

mapreduce-examples

A collection of mapreduce problems and solutions

Stars: ✭ 23 (-52.08%)

Mutual labels: mapreduce

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+22797.92%)

Mutual labels: mapreduce

infantry

Run MapReduce in user's browser.

Stars: ✭ 14 (-70.83%)

Mutual labels: mapreduce

Distributed Computing

distributed_computing includes mapreduce, kvstore, etc.

Stars: ✭ 654 (+1262.5%)

Mutual labels: mapreduce

ooso

Java library for running Serverless MapReduce jobs

Stars: ✭ 25 (-47.92%)

Mutual labels: mapreduce

Redisgears

Dynamic execution framework for your Redis data

Stars: ✭ 152 (+216.67%)

Mutual labels: mapreduce

rail

Scalable RNA-seq analysis

Stars: ✭ 74 (+54.17%)

Mutual labels: mapreduce

Bigslice

A serverless cluster computing system for the Go programming language

Stars: ✭ 469 (+877.08%)

Mutual labels: mapreduce

pyspark-algorithms

PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2

Stars: ✭ 72 (+50%)

Mutual labels: mapreduce

Src

A light-weight distributed stream computing framework for Golang

Stars: ✭ 67 (+39.58%)

Mutual labels: mapreduce

Braft

An industrial-grade C++ implementation of the RAFT consensus algorithm based on brpc, widely used inside Baidu to build highly available distributed systems.

Stars: ✭ 2,964 (+6075%)

Mutual labels: raft-consensus-algorithm

Behemoth

Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.

Stars: ✭ 286 (+495.83%)

Mutual labels: mapreduce

data-algorithms-with-spark

O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian

Stars: ✭ 34 (-29.17%)

Mutual labels: mapreduce

Atomix

A reactive Java framework for building fault-tolerant distributed systems

Stars: ✭ 2,182 (+4445.83%)

Mutual labels: raft-consensus-algorithm

Mare

MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.

Stars: ✭ 11 (-77.08%)

Mutual labels: mapreduce

st-hadoop

ST-Hadoop is an open-source MapReduce extension of Hadoop designed specifically to analyze your spatio-temporal data efficiently

Stars: ✭ 17 (-64.58%)

Mutual labels: mapreduce

Dampr

Python Data Processing library

Stars: ✭ 102 (+112.5%)

Mutual labels: mapreduce

connected-component

Map Reduce Implementation of Connected Component on Apache Spark

Stars: ✭ 68 (+41.67%)

Mutual labels: mapreduce

Coursera Uw Machine Learning Clustering Retrieval

Stars: ✭ 25 (-47.92%)

Mutual labels: mapreduce

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (-29.17%)

Mutual labels: mapreduce

6.824 2017

⚡️ 6.824: Distributed Systems (Spring 2017). A course that presents abstractions and implementation techniques for engineering distributed systems.

Stars: ✭ 219 (+356.25%)

Mutual labels: mapreduce

mapreduce

An in-process MapReduce library to help you optimize service response time or concurrent task processing.

Stars: ✭ 93 (+93.75%)

Mutual labels: mapreduce

Yandex Big Data Engineering

Stars: ✭ 17 (-64.58%)

Mutual labels: mapreduce

durablefunctions-mapreduce-dotnet

An implementation of MapReduce on top of C# Durable Functions over the NYC 2017 Taxi dataset to compute average ride time per day

Stars: ✭ 20 (-58.33%)

Mutual labels: mapreduce

Repository

Personal study knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.

Stars: ✭ 92 (+91.67%)

Mutual labels: mapreduce

MLBD

Materials for "Machine Learning on Big Data" course

Stars: ✭ 20 (-58.33%)

Mutual labels: mapreduce

Corral

🐎 A serverless MapReduce framework written for AWS Lambda

Stars: ✭ 648 (+1250%)

Mutual labels: mapreduce

ParallelUtilities.jl

Fast and easy parallel mapreduce on HPC clusters

Stars: ✭ 28 (-41.67%)

Mutual labels: mapreduce

Dpark

Python clone of Spark, a MapReduce-like framework in Python

Stars: ✭ 2,668 (+5458.33%)

Mutual labels: mapreduce

Data-pipeline-project

Data pipeline project

Stars: ✭ 18 (-62.5%)

Mutual labels: mapreduce

Bigdata

💎🔥 Big data study notes

Stars: ✭ 488 (+916.67%)

Mutual labels: mapreduce

interview-refresh-java-bigdata

A one-stop repo for looking up code snippets of core Java concepts, SQL, data structures, and big data. It also includes interview questions asked in real life.

Stars: ✭ 25 (-47.92%)

Mutual labels: mapreduce

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (+47.92%)

Mutual labels: mapreduce

HadoopDedup

🍉 Large-scale deduplication of massive data sets based on Hadoop and HBase

Stars: ✭ 27 (-43.75%)

Mutual labels: mapreduce

Bdp Dataplatform

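Data platform for big data ecosystem solutions: a big data solution built around big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computation, development, and application building.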

Stars: ✭ 456 (+850%)

Mutual labels: mapreduce

lectures-hse-spark

Scalable machine learning and big data analysis with Apache Spark

Stars: ✭ 20 (-58.33%)

Mutual labels: mapreduce

Asakusafw

Asakusa Framework

Stars: ✭ 114 (+137.5%)

Mutual labels: mapreduce

bigdata-doc

Big data study notes, learning roadmap, and a collection of technical case studies.

Stars: ✭ 37 (-22.92%)

Mutual labels: mapreduce

Cascading

Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows locally or on a cluster. See https://github.com/Cascading/cascading for the release repository.

Stars: ✭ 318 (+562.5%)

Mutual labels: mapreduce

Ckite

CKite - A JVM implementation of the Raft distributed consensus algorithm written in Scala

Stars: ✭ 214 (+345.83%)

Mutual labels: raft-consensus-algorithm

Elixir Iteraptor

Handy enumerable operations implementation.

Stars: ✭ 55 (+14.58%)

Mutual labels: mapreduce

Redisson

Redisson - Redis Java client with features of In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, local cache ...

Stars: ✭ 17,972 (+37341.67%)

Mutual labels: mapreduce

gomrjob

gomrjob - a Go Framework for Hadoop Map Reduce Jobs

Stars: ✭ 39 (-18.75%)

Mutual labels: mapreduce

Powerjob

Enterprise job scheduling middleware with distributed computing ability.

Stars: ✭ 3,231 (+6631.25%)

Mutual labels: mapreduce

Avro Hadoop Starter

Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.