Data Algorithms Book: MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+1625.45%)
Mare: MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-80%)
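Bigdata Interview: 🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, with the author's own answer summaries. Currently covers interview questions for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.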
Stars: ✭ 857 (+1458.18%)
Mobius: C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+1589.09%)
Corral: 🐎 A serverless MapReduce framework written for AWS Lambda
Stars: ✭ 648 (+1078.18%)
Cdap: An open source framework for building data analytic applications.
Stars: ✭ 509 (+825.45%)
Bigdata: 💎🔥 Big data study notes
Stars: ✭ 488 (+787.27%)
Bigslice: A serverless cluster computing system for the Go programming language
Stars: ✭ 469 (+752.73%)
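Bdp Dataplatform: A data platform for big data ecosystem solutions: a big data solution built on big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, container deployment platforms, data platform collection, data platform storage, data platform computing, data platform development, and data platform application building.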
Stars: ✭ 456 (+729.09%)
Data Science Ipython Notebooks: Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+39987.27%)
Cascading: Cascading is a feature-rich API for defining and executing complex and fault-tolerant data processing flows locally or on a cluster. See https://github.com/Cascading/cascading for the release repository.
Stars: ✭ 318 (+478.18%)
Behemoth: Behemoth is an open source platform for large-scale document analysis based on Apache Hadoop.
Stars: ✭ 286 (+420%)
Redisson: Redisson - Redis Java client with features of In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, local cache ...
Stars: ✭ 17,972 (+32576.36%)
Tdigest: t-Digest data structure in Python. Useful for percentiles and quantiles, including in distributed environments like PySpark
Stars: ✭ 274 (+398.18%)
Guitar: A Simple and Efficient Distributed Multidimensional BI Analysis Engine.
Stars: ✭ 86 (+56.36%)
st-hadoop: ST-Hadoop is an open-source MapReduce extension of Hadoop designed specifically to analyze your spatio-temporal data efficiently
Stars: ✭ 17 (-69.09%)
dtail: DTail is a distributed DevOps tool for tailing, grepping, catting logs and other text files on many remote machines at once.
Stars: ✭ 112 (+103.64%)
connected-component: MapReduce implementation of connected components on Apache Spark
Stars: ✭ 68 (+23.64%)
big data: A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-38.18%)
mapreduce: An in-process MapReduce library to help you optimize service response time or concurrent task processing.
Stars: ✭ 93 (+69.09%)
infantry: Run MapReduce in the user's browser.
Stars: ✭ 14 (-74.55%)
durablefunctions-mapreduce-dotnet: An implementation of MapReduce on top of C# Durable Functions over the NYC 2017 Taxi dataset to compute average ride time per day
Stars: ✭ 20 (-63.64%)
MLBD: Materials for "Machine Learning on Big Data" course
Stars: ✭ 20 (-63.64%)
ooso: Java library for running Serverless MapReduce jobs
Stars: ✭ 25 (-54.55%)
rail: Scalable RNA-seq analysis
Stars: ✭ 74 (+34.55%)
interview-refresh-java-bigdata: A one-stop repo for looking up code snippets of core Java concepts, SQL, data structures, and big data. It also includes interview questions asked in real life.
Stars: ✭ 25 (-54.55%)
etran: Erlang parse transforms, including fold (MapReduce) comprehension, Elixir-like pipeline, and default function arguments
Stars: ✭ 19 (-65.45%)
HadoopDedup: 🍉 Large-scale deduplication of massive data based on Hadoop and HBase
Stars: ✭ 27 (-50.91%)
pyspark-algorithms: PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+30.91%)
lectures-hse-spark: Scalable machine learning and big data analysis with Apache Spark
Stars: ✭ 20 (-63.64%)
learning-hadoop-and-spark: Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+165.45%)
gomrjob: gomrjob - a Go Framework for Hadoop Map Reduce Jobs
Stars: ✭ 39 (-29.09%)
Dpark: Python clone of Spark, a MapReduce-like framework in Python
Stars: ✭ 2,668 (+4750.91%)
Powerjob: Enterprise job scheduling middleware with distributed computing ability.
Stars: ✭ 3,231 (+5774.55%)
6.824 2017: ⚡️ 6.824: Distributed Systems (Spring 2017). A course that presents abstractions and implementation techniques for engineering distributed systems.
Stars: ✭ 219 (+298.18%)
Redisgears: Dynamic execution framework for your Redis data
Stars: ✭ 152 (+176.36%)
Asakusafw: Asakusa Framework
Stars: ✭ 114 (+107.27%)
Avro Hadoop Starter: Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
Stars: ✭ 110 (+100%)
Dampr: Python Data Processing library
Stars: ✭ 102 (+85.45%)
Repository: A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (+67.27%)
Mapreduce: MapReduce by examples
Stars: ✭ 91 (+65.45%)
Src: A light-weight distributed stream computing framework for Golang
Stars: ✭ 67 (+21.82%)