840 open source projects that are alternatives to or similar to Shifu

bigdatatutorial
bigdatatutorial
Stars: ✭ 34 (-83.57%)
Mutual labels:  hadoop, bigdata
Javaorbigdata Interview
A collection of interview knowledge points for Java developers and big data developers
Stars: ✭ 203 (-1.93%)
Mutual labels:  hadoop, bigdata
Big data architect skills
Skills a big data architect should master
Stars: ✭ 400 (+93.24%)
Mutual labels:  hadoop, bigdata
learning-spark
Tidy up Spark and Hadoop tutorials.
Stars: ✭ 28 (-86.47%)
Mutual labels:  hadoop, bigdata
Awesome Learning
Practice source code repository: https://github.com/jast90/bigdata . Search for Jast on WeChat and follow the official account for the latest technical posts 😯.
Stars: ✭ 197 (-4.83%)
Mutual labels:  hadoop, bigdata
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+2802.42%)
Mutual labels:  hadoop, bigdata
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-87.92%)
Mutual labels:  hadoop, pipeline
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+3.86%)
Mutual labels:  hadoop, bigdata
hadoopoffice
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-72.95%)
Mutual labels:  hadoop, bigdata
leaflet heatmap
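A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap rendering step is moved to offline computation. The data is first processed in parallel with Apache Spark, the heatmap is then rendered with Apache Spark, and leafletjs loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering is currently implemented with Apache Spark; perhaps because Apache Spark is not well suited to this kind of computation, or because I did not design the algorithm well, the parallel version is slower than single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .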
Stars: ✭ 13 (-93.72%)
Mutual labels:  hadoop, bigdata
TIL
Today I Learned
Stars: ✭ 43 (-79.23%)
Mutual labels:  hadoop, pipeline
Apache Spark Hands On
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-64.25%)
Mutual labels:  hadoop, bigdata
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+2632.37%)
Mutual labels:  hadoop, random-forest
the-apache-ignite-book
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Include Apache Ignite 2.6 or above
Stars: ✭ 65 (-68.6%)
Mutual labels:  hadoop, bigdata
bigdata-doc
Big data study notes, learning roadmap, and a collection of technical case studies.
Stars: ✭ 37 (-82.13%)
Mutual labels:  hadoop, bigdata
STOCK-RETURN-PREDICTION-USING-KNN-SVM-GUASSIAN-PROCESS-ADABOOST-TREE-REGRESSION-AND-QDA
Forecast stock prices using machine learning approach. A time series analysis. Employ the Use of Predictive Modeling in Machine Learning to Forecast Stock Return. Approach Used by Hedge Funds to Select Tradeable Stocks
Stars: ✭ 94 (-54.59%)
Mutual labels:  pipeline, random-forest
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-83.57%)
Mutual labels:  hadoop, bigdata
Bigdata Interview
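🎯 🌟 [Big data interview questions] Big data interview questions I collected online, together with my own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.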
Stars: ✭ 857 (+314.01%)
Mutual labels:  hadoop, bigdata
Hadoop Attack Library
A collection of pentest tools and resources targeting Hadoop environments
Stars: ✭ 228 (+10.14%)
Mutual labels:  hadoop, bigdata
dockerfiles
Multi docker container images for main Big Data Tools. (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (-85.99%)
Mutual labels:  hadoop, bigdata
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (-91.3%)
Mutual labels:  hadoop, bigdata
yuzhouwan
Code Library for My Blog
Stars: ✭ 39 (-81.16%)
Mutual labels:  hadoop, bigdata
flokkr
Documentation placeholder and utilities for all the other containers.
Stars: ✭ 30 (-85.51%)
Mutual labels:  hadoop, bigdata
Bigdataguide
Big data learning: learn big data from scratch, including study videos for each learning stage and interview materials
Stars: ✭ 817 (+294.69%)
Mutual labels:  hadoop, bigdata
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-97.58%)
Mutual labels:  hadoop, bigdata
Bigdata Notebook
Stars: ✭ 100 (-51.69%)
Mutual labels:  hadoop, bigdata
Spline
Data Lineage Tracking And Visualization Solution
Stars: ✭ 306 (+47.83%)
Mutual labels:  hadoop, bigdata
Bigdata Notes
A beginner's guide to big data ⭐
Stars: ✭ 10,991 (+5209.66%)
Mutual labels:  hadoop, bigdata
Hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-39.13%)
Mutual labels:  hadoop, bigdata
Flinkx
Based on Apache Flink. Supports data synchronization/integration and streaming SQL computation.
Stars: ✭ 2,651 (+1180.68%)
Mutual labels:  bigdata
Zumis
zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
Stars: ✭ 178 (-14.01%)
Mutual labels:  pipeline
Bigdata practice
Big data analysis and visualization in practice
Stars: ✭ 166 (-19.81%)
Mutual labels:  bigdata
Dolphinbeat
A server that pulls and parses the MySQL binlog and pushes change data into different sinks such as Kafka.
Stars: ✭ 164 (-20.77%)
Mutual labels:  pipeline
Drone Cache
A Drone plugin for caching current workspace files between builds to reduce your build times
Stars: ✭ 194 (-6.28%)
Mutual labels:  pipeline
Bigdata Playground
A complete example of a big data application using : Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (-14.49%)
Mutual labels:  hadoop
Cloud Dev
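Cloud Dev is a cloud-born, closed-loop, everything-as-code approach to software development. It lets business, development, and operations staff collaborate transparently in a single cloud environment across the whole software lifecycle (requirements, design, coding, build, deployment, operations), rather than working in isolation or relying on multiple separate tools.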
Stars: ✭ 164 (-20.77%)
Mutual labels:  pipeline
Big Whale
Scheduling of offline jobs such as Spark and Flink, and monitoring of real-time jobs
Stars: ✭ 163 (-21.26%)
Mutual labels:  hadoop
Proposal Smart Pipelines
Old archived draft proposal for smart pipelines. Go to the new Hack-pipes proposal at js-choi/proposal-hack-pipes.
Stars: ✭ 177 (-14.49%)
Mutual labels:  pipeline
Bigdata docker
Big Data Ecosystem Docker
Stars: ✭ 161 (-22.22%)
Mutual labels:  hadoop
Core
The safe post-production pipeline - https://getavalon.github.io/2.0
Stars: ✭ 162 (-21.74%)
Mutual labels:  pipeline
Jenkinsdocs
Jenkins practice documentation. Latest site address: http://www.idevops.site
Stars: ✭ 200 (-3.38%)
Mutual labels:  pipeline
Nutch
Apache Nutch is an extensible and scalable web crawler
Stars: ✭ 2,277 (+1000%)
Mutual labels:  hadoop
Tensorflow Ml Nlp
Natural language processing starting with TensorFlow and machine learning (from logistic regression to a transformer chatbot)
Stars: ✭ 176 (-14.98%)
Mutual labels:  random-forest
Operator
Kubernetes operator to manage installation, update, and uninstallation of tektoncd projects (pipeline, …)
Stars: ✭ 161 (-22.22%)
Mutual labels:  pipeline
Machine Learning Models
Decision Trees, Random Forest, Dynamic Time Warping, Naive Bayes, KNN, Linear Regression, Logistic Regression, Mixture Of Gaussian, Neural Network, PCA, SVD, Gaussian Naive Bayes, Fitting Data to Gaussian, K-Means
Stars: ✭ 160 (-22.71%)
Mutual labels:  random-forest
Chefboost
A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and Adaboost w/categorical features support for Python
Stars: ✭ 176 (-14.98%)
Mutual labels:  random-forest
Java Notes
☕️ Java basics 👫 Object-oriented thinking ✏️ Algorithms 📝 Operating systems ☁️ Networking 💾 Databases 🙊 Spring 💡 System architecture 🐘 Big data
Stars: ✭ 160 (-22.71%)
Mutual labels:  bigdata
Aws Serverless Cicd Workshop
Learn how to build a CI/CD pipeline for SAM-based applications
Stars: ✭ 158 (-23.67%)
Mutual labels:  pipeline
Pipeline.rs
☔️ => ⛅️ => ☀️
Stars: ✭ 188 (-9.18%)
Mutual labels:  pipeline
Pipelines
Machine Learning Pipelines for Kubeflow
Stars: ✭ 2,607 (+1159.42%)
Mutual labels:  pipeline
Presto
The official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+6159.42%)
Mutual labels:  hadoop
Spacy Wordnet
spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface
Stars: ✭ 156 (-24.64%)
Mutual labels:  pipeline
Randomforestexplainer
A set of tools to understand what is happening inside a Random Forest
Stars: ✭ 175 (-15.46%)
Mutual labels:  random-forest
Batchflow
BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
Stars: ✭ 156 (-24.64%)
Mutual labels:  pipeline
Ects
Elastic Crontab System: a simple, easy-to-use distributed scheduled task management system
Stars: ✭ 156 (-24.64%)
Mutual labels:  pipeline
Recommendsys
Recommendation project (real-time and offline recommendation)
Stars: ✭ 198 (-4.35%)
Mutual labels:  hadoop
Hive Jdbc Uber Jar
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Stars: ✭ 188 (-9.18%)
Mutual labels:  hadoop
Machine Learning Is All You Need
🔥🌟《Machine Learning 格物志》: ML + DL + RL basic codes and notes by sklearn, PyTorch, TensorFlow, Keras & the most important, from scratch!💪 This repository is ALL You Need!
Stars: ✭ 173 (-16.43%)
Mutual labels:  random-forest
Hadoop Common
Mirror of Apache Hadoop common
Stars: ✭ 155 (-25.12%)
Mutual labels:  hadoop
Fluids
Fluid dynamics component of Chemical Engineering Design Library (ChEDL)
Stars: ✭ 154 (-25.6%)
Mutual labels:  pipeline