DigitrecognizerJava Convolutional Neural Network example for Hand Writing Digit Recognition
Stars: ✭ 23 (-70.51%)
Spark BigqueryGoogle BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Stars: ✭ 65 (-16.67%)
Hadoop study定期更新Hadoop生态圈中常用大数据组件文档 重心依次为: Flink Solr Sparksql ES Scala Kafka Hbase/phoenix Redis Kerberos (项目包含hadoop思维导图 印象笔记 Scala版本简单demo 常用工具类 去敏后的train code 持续更新!!!)
Stars: ✭ 567 (+626.92%)
Spark DariaEssential Spark extensions and helper methods ✨😲
Stars: ✭ 553 (+608.97%)
Vagrant ProjectsVagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR
Stars: ✭ 34 (-56.41%)
LopqTraining of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Stars: ✭ 530 (+579.49%)
Zemberek Nlp ServerZemberek Türkçe NLP Java Kütüphanesi üzerine REST Docker Sunucu
Stars: ✭ 60 (-23.08%)
CdapAn open source framework for building data analytic applications.
Stars: ✭ 509 (+552.56%)
AkkeeperAn easy way to deploy your Akka services to a distributed environment.
Stars: ✭ 30 (-61.54%)
Bigdata💎🔥大数据学习笔记
Stars: ✭ 488 (+525.64%)
Docker HadoopApache Hadoop docker image
Stars: ✭ 1,190 (+1425.64%)
School Of SreAt LinkedIn, we are using this curriculum for onboarding our entry-level talents into the SRE role.
Stars: ✭ 5,141 (+6491.03%)
SparkmagicJupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+1123.08%)
Basehttps://www.researchgate.net/profile/Rajah_Iyer
Stars: ✭ 48 (-38.46%)
Pyspark ExamplesCode examples on Apache Spark using python
Stars: ✭ 58 (-25.64%)
AtsdAxibase Time Series Database Documentation
Stars: ✭ 68 (-12.82%)
LabsResearch on distributed system
Stars: ✭ 73 (-6.41%)
JumbuneJumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-17.95%)
Awesome Recommendation EngineThe purpose of this tiny project is to put things together with the know how that i learned from the course big data expert from formacionhadoop.com The idea is to show how to play with apache spark streaming, kafka,mongo, spark machine learning algorithms.
Stars: ✭ 47 (-39.74%)
HeraclesHigh performance HBase / Spark SQL engine
Stars: ✭ 27 (-65.38%)
Dji Firmware ToolsTools for handling firmwares of DJI products, with focus on quadcopters.
Stars: ✭ 424 (+443.59%)
FeatranA Scala feature transformation library for data science and machine learning
Stars: ✭ 420 (+438.46%)
Agile data code 2Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition
Stars: ✭ 413 (+429.49%)
TedsdsApache Spark - Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-82.05%)
Sparkling WaterSparkling Water provides H2O functionality inside Spark cluster
Stars: ✭ 887 (+1037.18%)
Awesome PulsarA curated list of Pulsar tools, integrations and resources.
Stars: ✭ 57 (-26.92%)
KontextfreiWriting application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (-14.1%)
W2vWord2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-17.95%)
Spark TdaSparkTDA is a package for Apache Spark providing Topological Data Analysis Functionalities.
Stars: ✭ 45 (-42.31%)
Hadoop For GeoeventArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-93.59%)
Sparkling TitanicTraining models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-84.62%)
IgniteApache Ignite
Stars: ✭ 4,027 (+5062.82%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (-29.49%)
MareMaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-85.9%)
Lpa DetectorOptimize and improve the Label propagation algorithm
Stars: ✭ 75 (-3.85%)
Goodreads etl pipelineAn end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+916.67%)
MoosefsMooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+1214.1%)
Spark RedisA connector for Spark that allows reading and writing to/from Redis cluster
Stars: ✭ 773 (+891.03%)
Utils4sscala、spark使用过程中,各种测试用例以及相关资料整理
Stars: ✭ 1,070 (+1271.79%)
SparklyrR interface for Apache Spark
Stars: ✭ 775 (+893.59%)
Ds CheatsheetsList of Data Science Cheatsheets to rule the world
Stars: ✭ 9,452 (+12017.95%)
Kamu CliNext generation tool for decentralized exchange and transformation of semi-structured data
Stars: ✭ 69 (-11.54%)
Delta ArchitectureStreaming data changes to a Data Lake with Debezium and Delta Lake pipeline
Stars: ✭ 43 (-44.87%)
AngelA Flexible and Powerful Parameter Server for large-scale machine learning
Stars: ✭ 6,458 (+8179.49%)
Coding Now学习记录的一些笔记,以及所看得一些电子书eBooks、视频资源和平常收纳的一些自己认为比较好的博客、网站、工具。涉及大数据几大组件、Python机器学习和数据分析、Linux、操作系统、算法、网络等
Stars: ✭ 750 (+861.54%)
Spark Movie LensAn on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+855.13%)