
1202 open source projects that are alternatives to or similar to Data-pipeline-project

Behemoth

Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.

Stars: ✭ 286 (+1488.89%)

Mutual labels: hadoop, mapreduce

Avro Hadoop Starter

Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.

Stars: ✭ 110 (+511.11%)

Mutual labels: hadoop, mapreduce

Src

A light-weight distributed stream computing framework for Golang

Stars: ✭ 67 (+272.22%)

Mutual labels: hadoop, mapreduce

Akkeeper

An easy way to deploy your Akka services to a distributed environment.

Stars: ✭ 30 (+66.67%)

Mutual labels: hadoop, deployment

ob bulkstash

Bulk Stash is a Docker rclone service to sync or copy files between different storage services. For example, you can copy files from one remote storage service to another, such as from Amazon S3 to Google Cloud Storage, or from your laptop to remote storage.

Stars: ✭ 113 (+527.78%)

Mutual labels: amazon-web-services, data-pipeline

web-click-flow

Offline log analysis of website clickstreams

Stars: ✭ 14 (-22.22%)

Mutual labels: hadoop, mapreduce

Trampoline

Admin Spring Boot Locally

Stars: ✭ 325 (+1705.56%)

Mutual labels: deployment, maven

Bigdata

💎🔥 Big data study notes

Stars: ✭ 488 (+2611.11%)

Mutual labels: hadoop, mapreduce

GooglePlay-Web-Crawler

MapReduce project using Hadoop, Nutch, AWS EMR, Pig, Tez, Hive

Stars: ✭ 18 (+0%)

Mutual labels: hadoop, mapreduce

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+122388.89%)

Mutual labels: hadoop, mapreduce

Data Algorithms Book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Stars: ✭ 949 (+5172.22%)

Mutual labels: hadoop, mapreduce

Asakusafw

Asakusa Framework

Stars: ✭ 114 (+533.33%)

Mutual labels: hadoop, mapreduce

bigdata-doc

Big data study notes, a learning roadmap, and a collection of technical case studies.

Stars: ✭ 37 (+105.56%)

Mutual labels: hadoop, mapreduce

Cloudbreak

A tool for provisioning and managing Apache Hadoop clusters in the cloud. Cloudbreak, as part of the Hortonworks Data Platform, makes it easy to provision, configure and elastically grow HDP clusters on cloud infrastructure. Cloudbreak can be used to provision Hadoop across cloud infrastructure providers including AWS, Azure, GCP and OpenStack.

Stars: ✭ 301 (+1572.22%)

Mutual labels: hadoop, deployment

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+60961.11%)

Mutual labels: hadoop, mapreduce

qs-hadoop

Learning the big data ecosystem

Stars: ✭ 18 (+0%)

Mutual labels: hadoop, mapreduce

Ecs Deploy

Powerful CLI tool to simplify Amazon ECS deployments, rollbacks & scaling

Stars: ✭ 541 (+2905.56%)

Mutual labels: deployment, amazon-web-services

serverless-data-pipeline-sam

Serverless Data Pipeline powered by Kinesis Firehose, API Gateway, Lambda, S3, and Athena

Stars: ✭ 78 (+333.33%)

Mutual labels: amazon-web-services, data-pipeline

Repository

Personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.

Stars: ✭ 92 (+411.11%)

Mutual labels: hadoop, mapreduce

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Stars: ✭ 39 (+116.67%)

Mutual labels: hadoop, data-pipeline

skein

A tool and library for easily deploying applications on Apache YARN

Stars: ✭ 128 (+611.11%)

Mutual labels: hadoop, deployment

Cascading

Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster. See https://github.com/Cascading/cascading for the release repository.

Stars: ✭ 318 (+1666.67%)

Mutual labels: hadoop, mapreduce

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (+88.89%)

Mutual labels: hadoop, mapreduce

Bigdata Interview

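🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from around the web, together with my own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.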

Stars: ✭ 857 (+4661.11%)

Mutual labels: hadoop, mapreduce

gomrjob

gomrjob - a Go Framework for Hadoop Map Reduce Jobs

Stars: ✭ 39 (+116.67%)

Mutual labels: hadoop, mapreduce

learning-hadoop-and-spark

Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning

Stars: ✭ 146 (+711.11%)

Mutual labels: hadoop, mapreduce

Cruiser

A Pharo Tool to package applications

Stars: ✭ 41 (+127.78%)

Mutual labels: deployment

reproducible-build-maven-plugin

A simple Maven plugin to make your build byte-for-byte reproducible

Stars: ✭ 65 (+261.11%)

Mutual labels: maven

interview-refresh-java-bigdata

A one-stop repo for looking up code snippets on core Java concepts, SQL, data structures, and big data. It also includes interview questions asked in real life.

Stars: ✭ 25 (+38.89%)

Mutual labels: mapreduce

maven-shade-plugin

Apache Maven Shade Plugin

Stars: ✭ 120 (+566.67%)

Mutual labels: maven

vcredist

Lifecycle management for the Microsoft Visual C++ Redistributables

Stars: ✭ 91 (+405.56%)

Mutual labels: deployment

react-production-deployment

Deploy your React app to production on Netlify, Vercel and Heroku

Stars: ✭ 51 (+183.33%)

Mutual labels: deployment

library-booksystem

An introductory SSM-based project: an online library management system.

Stars: ✭ 26 (+44.44%)

Mutual labels: maven

SpringsScala

Sample Projects for Creating Springs Web services in Scala

Stars: ✭ 16 (-11.11%)

Mutual labels: maven

QtRelease Windows

Practice project that helps with Qt software deployment on Windows

Stars: ✭ 13 (-27.78%)

Mutual labels: deployment

ROSAppsDeployment

Deploying ROS apps using Docker

Stars: ✭ 14 (-22.22%)

Mutual labels: deployment

hive to es

A small tool for syncing Hive data warehouse data to Elasticsearch

Stars: ✭ 21 (+16.67%)

Mutual labels: hadoop

mosec-maven-plugin

Checks whether the third-party dependencies of a Maven project contain security vulnerabilities.

Stars: ✭ 85 (+372.22%)

Mutual labels: maven

Temps

λ A self-hostable serverless function runtime. Inspired by ZEIT Now.

Stars: ✭ 15 (-16.67%)

Mutual labels: deployment

iis

Information Inference Service of the OpenAIRE system

Stars: ✭ 16 (-11.11%)

Mutual labels: hadoop

WorldGuardExtraFlagsPlugin

Extension for the WorldGuard plugin.

Stars: ✭ 47 (+161.11%)

Mutual labels: maven

HDFS-Netdisc

A distributed cloud storage system based on Hadoop 🌴

Stars: ✭ 56 (+211.11%)

Mutual labels: hadoop

maven-artifacts-uploader

Command-line tool for uploading a directory of Maven artifacts to a Nexus 3.x repository

Stars: ✭ 30 (+66.67%)

Mutual labels: maven

catacomb

The simplest machine learning library for launching UIs, running evaluations, and comparing model performance.

Stars: ✭ 13 (-27.78%)

Mutual labels: deployment

selling-partner-sdk

Amazon Selling Partner JAVA SDK SP API

Stars: ✭ 15 (-16.67%)

Mutual labels: amazon-web-services

hive-bigquery-storage-handler

Hive Storage Handler for interoperability between BigQuery and Apache Hive

Stars: ✭ 16 (-11.11%)

Mutual labels: hadoop

ktlint-maven-plugin

Maven plugin for ktlint the Kotlin linter

Stars: ✭ 42 (+133.33%)

Mutual labels: maven

node-casperjs-aws-lambda

Base scaffolding app for a casperjs/phantomjs app running on Amazon (AWS) Lambda

Stars: ✭ 52 (+188.89%)

Mutual labels: amazon-web-services

etran

Erlang Parse Transforms Including Fold (MapReduce) comprehension, Elixir-like Pipeline, and default function arguments

Stars: ✭ 19 (+5.56%)

Mutual labels: mapreduce

machine-learning-data-pipeline

Pipeline module for parallel real-time data processing for machine learning models development and production purposes.

Stars: ✭ 22 (+22.22%)

Mutual labels: data-pipeline

HadoopDedup

🍉 Large-scale deduplication of massive data sets, based on Hadoop and HBase

Stars: ✭ 27 (+50%)

Mutual labels: mapreduce

Student-Information-Administration-System

University student information management system: a project I worked through on my own while learning as a beginner

Stars: ✭ 91 (+405.56%)

Mutual labels: maven

primrose

Primrose modeling framework for simple production models

Stars: ✭ 33 (+83.33%)

Mutual labels: deployment

yoda

Simple tool to dockerize and manage deployment of your project

Stars: ✭ 69 (+283.33%)

Mutual labels: deployment

aws-compute-decision-tree

A decision tree to help you decide on the right AWS compute service for your needs.

Stars: ✭ 25 (+38.89%)

Mutual labels: amazon-web-services

ebook-continuous-delivery-with-kubernetes-and-jenkins

Continuous Delivery for Java Apps: Build a CD Pipeline Step by Step Using Kubernetes, Docker, Vagrant, Jenkins, Spring, Maven and Artifactory

Stars: ✭ 39 (+116.67%)

Mutual labels: maven

maven-resource

Maven Repository Manager Concourse Resource

Stars: ✭ 22 (+22.22%)

Mutual labels: maven

spring-boot-web

Spring Boot scaffolding project

Stars: ✭ 29 (+61.11%)

Mutual labels: maven

terraform-aws-route53

A Terraform module to create a Route53 Domain Name System (DNS) on Amazon Web Services (AWS). https://aws.amazon.com/route53/

Stars: ✭ 39 (+116.67%)

Mutual labels: amazon-web-services

atguigu ssm crud

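Atguigu-SSM-CRUD: a basic CRUD system built with IDEA + Maven, with front-end/back-end interaction. The front end uses BootStrap + Ajax asynchronous requests with DOM rendering; the back end uses SpringMVC + MyBatis + Mysql8.0 + Servlet + Jsp. It follows REST-style URL conventions, adds the data validation provided by Hibernate, and supports PageHelper pagination, which makes it well suited to practice at the SSM stage. It also uses many front-end operations and BootStrap components, so it helps with learning JS and front-end frameworks.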

Stars: ✭ 52 (+188.89%)

Mutual labels: maven

1-60 of 1202 similar projects
