caizkun / mapreduce-examples

Licence: other
A collection of MapReduce problems and solutions

Programming Languages

java

Projects that are alternatives to or similar to mapreduce-examples

learning-hadoop-and-spark
Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+534.78%)
Mutual labels:  mapreduce
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (-21.74%)
Mutual labels:  mapreduce
durablefunctions-mapreduce-dotnet
An implementation of MapReduce on top of C# Durable Functions over the NYC 2017 Taxi dataset to compute average ride time per-day
Stars: ✭ 20 (-13.04%)
Mutual labels:  mapreduce
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+213.04%)
Mutual labels:  mapreduce
rail
Scalable RNA-seq analysis
Stars: ✭ 74 (+221.74%)
Mutual labels:  mapreduce
ParallelUtilities.jl
Fast and easy parallel mapreduce on HPC clusters
Stars: ✭ 28 (+21.74%)
Mutual labels:  mapreduce
mit-6.824-distributed-systems
Template repository to work on the labs from MIT 6.824 Distributed Systems course.
Stars: ✭ 48 (+108.7%)
Mutual labels:  mapreduce
GooglePlay-Web-Crawler
MapReduce project using Hadoop, Nutch, AWS EMR, Pig, Tez, Hive
Stars: ✭ 18 (-21.74%)
Mutual labels:  mapreduce
Data-pipeline-project
Data pipeline project
Stars: ✭ 18 (-21.74%)
Mutual labels:  mapreduce
web-click-flow
Offline log analysis of website clickstreams
Stars: ✭ 14 (-39.13%)
Mutual labels:  mapreduce
HadoopDedup
🍉 Large-scale deduplication of massive data sets, based on Hadoop and HBase
Stars: ✭ 27 (+17.39%)
Mutual labels:  mapreduce
interview-refresh-java-bigdata
A one-stop repo for looking up code snippets of core Java concepts, SQL, data structures, and big data. It also includes interview questions asked in real interviews.
Stars: ✭ 25 (+8.7%)
Mutual labels:  mapreduce
ooso
Java library for running Serverless MapReduce jobs
Stars: ✭ 25 (+8.7%)
Mutual labels:  mapreduce
lectures-hse-spark
Scalable machine learning and big data analysis with Apache Spark
Stars: ✭ 20 (-13.04%)
Mutual labels:  mapreduce
infantry
Run MapReduce in user's browser.
Stars: ✭ 14 (-39.13%)
Mutual labels:  mapreduce
bigdata-doc
Big data study notes, learning roadmaps, and a collection of technical case studies.
Stars: ✭ 37 (+60.87%)
Mutual labels:  mapreduce
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+69.57%)
Mutual labels:  hadoop-mapreduce
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+47.83%)
Mutual labels:  mapreduce
mapreduce
An in-process MapReduce library to help you optimize service response time or concurrent task processing.
Stars: ✭ 93 (+304.35%)
Mutual labels:  mapreduce
MLBD
Materials for "Machine Learning on Big Data" course
Stars: ✭ 20 (-13.04%)
Mutual labels:  mapreduce

MapReduce Examples

MapReduce is the key programming model for data processing in the Hadoop ecosystem. This repository collects problems that can be solved with MapReduce, grouped by design pattern.

  • Summarization Patterns
    • Word Count
    • Inverted Index (demo Tool, ToolRunner)
    • Matrix-vector Multiplication (demo MultipleInputs)
    • Matrix-matrix Multiplication
  • Filtering Patterns
    • Anagram
    • Top K
    • Sentiment Analysis
  • Organization Patterns
    • Partial Sort
    • Secondary Sort
  • Join Patterns
  • Metapatterns
    • NGramAutocomplete
    • Page Rank
    • Recommender System
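
To make the programming model concrete, here is a minimal sketch of the first problem in the list, Word Count, written as plain Java with no Hadoop dependency. It is not the repository's code: the class name WordCountSketch and the helper methods map, shuffle, reduce, and wordCount are hypothetical names chosen for illustration, and the whole pipeline runs in memory in a single process. A real Hadoop job would instead express the same two functions as Mapper and Reducer classes and let the framework handle the shuffle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative, in-memory imitation of the map -> shuffle -> reduce flow.
// All names here are assumptions for the sketch, not part of the repository.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every token in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key, with keys kept sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run the three phases over all input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(emitted).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the quick brown fox", "the lazy dog", "The fox");
        // prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(wordCount(lines));
    }
}
```

The sketch requires Java 9 or later (for Map.entry and List.of). Keeping map and reduce as separate, side-effect-free functions mirrors the property that lets a real framework run many mappers and reducers in parallel.
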