Likelike
An implementation of locality-sensitive hashing with Hadoop
Stars: ✭ 58 (-34.09%)
Tez
Apache Tez
Stars: ✭ 313 (+255.68%)
Stormtweetssentimentd3viz
Computes and visualizes sentiment analysis of tweets from US states in real time using Storm.
Stars: ✭ 25 (-71.59%)
Spline
Data Lineage Tracking And Visualization Solution
Stars: ✭ 306 (+247.73%)
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+1257.95%)
Elasticluster
Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (+238.64%)
Android Nosql
Lightweight, simple structured NoSQL database for Android
Stars: ✭ 284 (+222.73%)
Docker Hadoop
A Docker container with a full Hadoop cluster setup with Spark and Zeppelin
Stars: ✭ 54 (-38.64%)
Hadoop Mini Clusters
hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE
Stars: ✭ 265 (+201.14%)
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-94.32%)
pulse
phData Pulse application log aggregation and monitoring
Stars: ✭ 13 (-85.23%)
knit
Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead
Stars: ✭ 53 (-39.77%)
Winutils
winutils.exe, hadoop.dll, and hdfs.dll binaries for Hadoop on Windows
Stars: ✭ 657 (+646.59%)
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-84.09%)
Base
https://www.researchgate.net/profile/Rajah_Iyer
Stars: ✭ 48 (-45.45%)
XLearning-GPU
qihoo360 xlearning with GPU support; AI on Hadoop
Stars: ✭ 22 (-75%)
Tony
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 626 (+611.36%)
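leaflet heatmap
A simple visualization of Huzhou call data. Assuming the data volume is too large for the browser to draw the heatmap directly, the heatmap rendering step is moved offline for computation and analysis. The data is computed in parallel with Apache Spark, the heatmap is then rendered with Apache Spark as well, and leafletjs loads the OpenStreetMap layer and the heatmap layer to give good interactivity. The rendering is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than the single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .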
Stars: ✭ 13 (-85.23%)
Apache Spark Hands On
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-15.91%)
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+6327.27%)
Nagios Plugins
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (+1036.36%)
swordfish
Open-source distributed workflow scheduling tool; also supports streaming tasks.
Stars: ✭ 35 (-60.23%)
Alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+6012.5%)
jumbo
🐘 A local Hadoop cluster bootstrapper using Vagrant, Ansible, and Ambari.
Stars: ✭ 17 (-80.68%)
TIL
Today I Learned
Stars: ✭ 43 (-51.14%)
Bigdata
💎🔥 Big data study notes
Stars: ✭ 488 (+454.55%)
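TitanDataOperationSystem
The best big data project. The "Titan Data Operation System" is a full-stack, closed-loop system: a web system for data visualization, log collection through a Flume-Kafka-Flume pipeline, a data warehouse designed in Hive, Spark code that transforms data between warehouse tables and migrates ADS-layer tables to MySQL, and Azkaban for scheduling timed jobs. Technologies used: Java/Scala, Hadoop, Spark, Hive, Kafka, Flume, Azkaban, SpringBoot, Bootstrap, Echart, etc.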
Stars: ✭ 62 (-29.55%)
School Of Sre
At LinkedIn, we are using this curriculum for onboarding our entry-level talents into the SRE role.
Stars: ✭ 5,141 (+5742.05%)
Atsd
Axibase Time Series Database Documentation
Stars: ✭ 68 (-22.73%)
cloud
Cloud computing: environment setup and configuration files for hadoop, hive, hue, oozie, sqoop, hbase, and zookeeper
Stars: ✭ 48 (-45.45%)
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+24954.55%)
cmux
A set of commands for managing CDH clusters using Cloudera Manager REST API.
Stars: ✭ 34 (-61.36%)
Akkeeper
An easy way to deploy your Akka services to a distributed environment.
Stars: ✭ 30 (-65.91%)
platys-modern-data-platform
Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....
Stars: ✭ 35 (-60.23%)
Marmaray
Generic Data Ingestion & Dispersal Library for Hadoop
Stars: ✭ 414 (+370.45%)
Chukwa
Mirror of Apache Chukwa
Stars: ✭ 77 (-12.5%)
clusterdock
clusterdock is a framework for creating Docker-based container clusters
Stars: ✭ 26 (-70.45%)
fsbrowser
Fast desktop client for Hadoop Distributed File System
Stars: ✭ 27 (-69.32%)
Storm Camel Example
Real-time analysis and visualization with Storm-AMQ-Camel-Websockets-Highcharts integration.
Stars: ✭ 28 (-68.18%)
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+346.59%)
clickhouse hadoop
Import data from clickhouse to hadoop with pure SQL
Stars: ✭ 26 (-70.45%)
Jumbune
Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-27.27%)
Ignite
Apache Ignite
Stars: ✭ 4,027 (+4476.14%)
Hadoop cookbook
Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-6.82%)
Camus
Mirror of LinkedIn's Camus
Stars: ✭ 81 (-7.95%)
Tf Yarn
Train TensorFlow models on YARN in just a few lines of code!
Stars: ✭ 76 (-13.64%)
Waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-31.82%)
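Bigdata Interview
🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from around the web, together with my own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.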
Stars: ✭ 857 (+873.86%)
Wedatasphere
WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+322.73%)