learning-sparkTidy up Spark and Hadoop tutorials.
Stars: ✭ 28 (-86.47%)
Awesome Learning实践源码库:https://github.com/jast90/bigdata 。 微信搜索Jast关注公众号,获取最新技术分享😯。
Stars: ✭ 197 (-4.83%)
God Of Bigdata专注大数据学习面试,大数据成神之路开启。Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+2802.42%)
basinBasin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-87.92%)
SparkrdmaRDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+3.86%)
hadoopofficeHadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (-72.95%)
leaflet heatmap简单的可视化湖州通话数据 假设数据量很大,没法用浏览器直接绘制热力图,把绘制热力图这一步骤放到线下计算分析。使用Apache Spark并行计算数据之后,再使用Apache Spark绘制热力图,然后用leafletjs加载OpenStreetMap图层和热力图图层,以达到良好的交互效果。现在使用Apache Spark实现绘制,可能是Apache Spark不擅长这方面的计算或者是我没有设计好算法,并行计算的速度比不上单机计算。Apache Spark绘制热力图和计算代码在这 https://github.com/yuanzhaokang/ParallelizeHeatmap.git .
Stars: ✭ 13 (-93.72%)
TILToday I Learned
Stars: ✭ 43 (-79.23%)
Apache Spark Hands OnEducational notes,Hands on problems w/ solutions for hadoop ecosystem
Stars: ✭ 74 (-64.25%)
H2o 3H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+2632.37%)
the-apache-ignite-bookAll code samples, scripts and more in-depth examples for The Apache Ignite Book. Include Apache Ignite 2.6 or above
Stars: ✭ 65 (-68.6%)
big dataA collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-83.57%)
Bigdata Interview🎯 🌟[大数据面试题]分享自己在网络上收集的大数据相关的面试题以及自己的答案总结.目前包含Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper框架的面试题知识总结
Stars: ✭ 857 (+314.01%)
Hadoop Attack LibraryA collection of pentest tools and resources targeting Hadoop environments
Stars: ✭ 228 (+10.14%)
dockerfilesMulti docker container images for main Big Data Tools. (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (-85.99%)
yuzhouwanCode Library for My Blog
Stars: ✭ 39 (-81.16%)
flokkrDocumentation placeholder and utilities for all the other containers.
Stars: ✭ 30 (-85.51%)
Bigdataguide大数据学习,从零开始学习大数据,包含大数据学习各阶段学习视频、面试资料
Stars: ✭ 817 (+294.69%)
Hadoop For GeoeventArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-97.58%)
SplineData Lineage Tracking And Visualization Solution
Stars: ✭ 306 (+47.83%)
HadoopcryptoledgerHadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-39.13%)
FlinkxBased on Apache Flink. support data synchronization/integration and streaming SQL computation.
Stars: ✭ 2,651 (+1180.68%)
ZumiszUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
Stars: ✭ 178 (-14.01%)
DolphinbeatA server that pulls and parses MySQL binlog, pushs change data into different sinks like Kafka.
Stars: ✭ 164 (-20.77%)
Drone CacheA Drone plugin for caching current workspace files between builds to reduce your build times
Stars: ✭ 194 (-6.28%)
Bigdata PlaygroundA complete example of a big data application using : Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (-14.49%)
Cloud Dev云研发,是一种生于云上的闭环 + 代码化的软件开发方式。它可以让业务人员、开发人员、运营人员等在同一个云端共同协作、透明化地完成整个软件的生命周期(需求、设计、编码、构建、部署、运营),而非相互隔离,又或者是借助于多个软件才能完成工作。
Stars: ✭ 164 (-20.77%)
Big WhaleSpark、Flink等离线任务的调度以及实时任务的监控
Stars: ✭ 163 (-21.26%)
Proposal Smart PipelinesOld archived draft proposal for smart pipelines. Go to the new Hack-pipes proposal at js-choi/proposal-hack-pipes.
Stars: ✭ 177 (-14.49%)
CoreThe safe post-production pipeline - https://getavalon.github.io/2.0
Stars: ✭ 162 (-21.74%)
Jenkinsdocs Jenkins实践文档 最新站点地址: http://www.idevops.site
Stars: ✭ 200 (-3.38%)
NutchApache Nutch is an extensible and scalable web crawler
Stars: ✭ 2,277 (+1000%)
Tensorflow Ml Nlp텐서플로우와 머신러닝으로 시작하는 자연어처리(로지스틱회귀부터 트랜스포머 챗봇까지)
Stars: ✭ 176 (-14.98%)
OperatorKubernetes operator to manage installation, updation and uninstallation of tektoncd projects (pipeline, …)
Stars: ✭ 161 (-22.22%)
Machine Learning ModelsDecision Trees, Random Forest, Dynamic Time Warping, Naive Bayes, KNN, Linear Regression, Logistic Regression, Mixture Of Gaussian, Neural Network, PCA, SVD, Gaussian Naive Bayes, Fitting Data to Gaussian, K-Means
Stars: ✭ 160 (-22.71%)
ChefboostA Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4,5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and Adaboost w/categorical features support for Python
Stars: ✭ 176 (-14.98%)
Java Notes☕️ Java 基础 👫 面向对象思想✏️ 算法 📝 操作系统 ☁️ 网络 💾 数据库 🙊 Spring 💡 系统架构🐘大数据
Stars: ✭ 160 (-22.71%)
PipelinesMachine Learning Pipelines for Kubeflow
Stars: ✭ 2,607 (+1159.42%)
PrestoThe official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+6159.42%)
Spacy Wordnetspacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface
Stars: ✭ 156 (-24.64%)
RandomforestexplainerA set of tools to understand what is happening inside a Random Forest
Stars: ✭ 175 (-15.46%)
BatchflowBatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
Stars: ✭ 156 (-24.64%)
EctsElastic Crontab System 简单易用的分布式定时任务管理系统
Stars: ✭ 156 (-24.64%)
Hive Jdbc Uber JarHive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Stars: ✭ 188 (-9.18%)
Machine Learning Is All You Need🔥🌟《Machine Learning 格物志》: ML + DL + RL basic codes and notes by sklearn, PyTorch, TensorFlow, Keras & the most important, from scratch!💪 This repository is ALL You Need!
Stars: ✭ 173 (-16.43%)
Hadoop CommonMirror of Apache Hadoop common
Stars: ✭ 155 (-25.12%)
FluidsFluid dynamics component of Chemical Engineering Design Library (ChEDL)
Stars: ✭ 154 (-25.6%)