H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+43407.69%)
Kafka Storm Starter
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+5500%)
Imwindow
Window and GUI system based on Dear ImGui from OCornut
Stars: ✭ 574 (+4315.38%)
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+4769.23%)
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (+100%)
Spark Redis
A connector for Spark that allows reading and writing to/from Redis cluster
Stars: ✭ 773 (+5846.15%)
Sparkctr
CTR prediction model based on Spark (LR, GBDT, DNN)
Stars: ✭ 740 (+5592.31%)
Layout
Single-file library for calculating 2D UI layouts using stacking boxes. Compiles as C99 or C++.
Stars: ✭ 551 (+4138.46%)
Kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Stars: ✭ 916 (+6946.15%)
Cimgui
C API for imgui (https://github.com/ocornut/imgui). Look at https://github.com/cimgui for other widgets
Stars: ✭ 707 (+5338.46%)
Dockerfiles
50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+6415.38%)
Imogen
GPU Texture Generator
Stars: ✭ 648 (+4884.62%)
Cvui
A (very) simple UI lib built on top of OpenCV drawing primitives
Stars: ✭ 619 (+4661.54%)
Mare
MaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-15.38%)
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+42307.69%)
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+6000%)
Glchaos.p
3D GPUs Strange Attractors and Hypercomplex Fractals explorer - up to 256 Million particles in RealTime
Stars: ✭ 590 (+4438.46%)
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+7046.15%)
Sparklearning
Learning Apache Spark, including code and data. Most parts can run locally.
Stars: ✭ 558 (+4192.31%)
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+5630.77%)
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+4038.46%)
Digitrecognizer
Java Convolutional Neural Network example for Handwritten Digit Recognition
Stars: ✭ 23 (+76.92%)
Cdhproject
Usage of the various Hadoop components, continuously updated
Stars: ✭ 733 (+5538.46%)
Imgui.net
An ImGui wrapper for .NET.
Stars: ✭ 848 (+6423.08%)
Frameless
Expressive types for Spark.
Stars: ✭ 717 (+5415.38%)
Hail
Scalable genomic data analysis.
Stars: ✭ 706 (+5330.77%)
Giu
Cross platform rapid GUI framework for golang based on Dear ImGui.
Stars: ✭ 862 (+6530.77%)
Scriptis
Scriptis is for interactive data analysis with script development (SQL, Pyspark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+5253.85%)
Code Red
A Graphics Interface for DirectX12 and Vulkan
Stars: ✭ 27 (+107.69%)
Freestyle
A cohesive & pragmatic framework of FP centric Scala libraries
Stars: ✭ 627 (+4723.08%)
Sparkling Water
Sparkling Water provides H2O functionality inside a Spark cluster
Stars: ✭ 887 (+6723.08%)
Dev Setup
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Stars: ✭ 5,590 (+42900%)
Sparkling Titanic
Training models with Apache Spark and PySpark for the Titanic Kaggle competition
Stars: ✭ 12 (-7.69%)
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+4600%)
Bigdataguide
Big data learning: learn big data from scratch, including study videos and interview materials for each stage of big data learning
Stars: ✭ 817 (+6184.62%)
Imgui Sfml
Dear ImGui binding for use with SFML
Stars: ✭ 596 (+4484.62%)
Spark Swagger
Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (+92.31%)
Imnodes
A small, dependency-free node editor for dear imgui
Stars: ✭ 591 (+4446.15%)
Raygui
A simple and easy-to-use immediate-mode gui library
Stars: ✭ 785 (+5938.46%)
Mongo Spark
The MongoDB Spark Connector
Stars: ✭ 588 (+4423.08%)
Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-15.38%)
Alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+41276.92%)
Sparklyr
R interface for Apache Spark
Stars: ✭ 775 (+5861.54%)
Spark Daria
Essential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+4153.85%)
Chronicler
Scala toolchain for InfluxDB
Stars: ✭ 24 (+84.62%)
Angel
A Flexible and Powerful Parameter Server for large-scale machine learning
Stars: ✭ 6,458 (+49576.92%)
Spartanengine
Game engine with an emphasis on architectural quality and performance
Stars: ✭ 869 (+6584.62%)
Mlfeature
Feature engineering toolkit for Spark MLlib.
Stars: ✭ 12 (-7.69%)
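Bigdata Interview
🎯 🌟 [Big data interview questions] A collection of big-data interview questions gathered online, together with the author's own answer summaries. Currently covers interview knowledge summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks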
Stars: ✭ 857 (+6492.31%)
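Coding Now
Study notes, along with eBooks and video resources the author has gone through, plus blogs, websites, and tools the author considers worthwhile. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more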
Stars: ✭ 750 (+5669.23%)