Elassandra
Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+1312.28%)
splink
Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (+58.77%)
Roaringbitmap
A better compressed bitset in Java
Stars: ✭ 2,460 (+2057.89%)
experiments
Code examples for my blog posts
Stars: ✭ 21 (-81.58%)
Spark On K8s Operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+1461.4%)
Bigdataclass
Two-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (-3.51%)
Spark Ffm
FFM (Field-Aware Factorization Machine) on Spark
Stars: ✭ 101 (-11.4%)
Flint
Webex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)
Stars: ✭ 85 (-25.44%)
Blog
A rough personal blog built with SpringBoot + Thymeleaf + Mybatis + LayUi + Lucene
Stars: ✭ 95 (-16.67%)
Seldon Server
Machine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+1158.77%)
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1528.07%)
Ammonite Spark
Run Spark calculations from Ammonite
Stars: ✭ 88 (-22.81%)
Splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (-7.89%)
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (-1.75%)
Springboot Templates
Integration of springboot with dubbo and netty; NoSQL templates for redis and mongodb; MQ templates for kafka, rocketmq, and rabbit; query-engine templates for solr, solrcloud, and elasticsearch
Stars: ✭ 100 (-12.28%)
Spark States
Custom state store providers for Apache Spark
Stars: ✭ 83 (-27.19%)
Hadoop cookbook
Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-28.07%)
Mleap
MLeap: Deploy ML Pipelines to Production
Stars: ✭ 1,232 (+980.7%)
Almond
A Scala kernel for Jupyter
Stars: ✭ 1,354 (+1087.72%)
Spark Gbtlr
Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
Stars: ✭ 81 (-28.95%)
Spark Py Notebooks
Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+1073.68%)
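Flink Learning
Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, internals, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus case studies of large-scale production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" is welcome.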
Stars: ✭ 11,378 (+9880.7%)
Repository
A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-19.3%)
Lambda Arch
Applying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-2.63%)
Big Data
🔧 Use dplyr to analyze Big Data 🐘
Stars: ✭ 93 (-18.42%)
Logigsk
A Linux-based software package to control LEDs on Logitech G910, G810, G610 and G410.
Stars: ✭ 107 (-6.14%)
Fingerprints
Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.
Stars: ✭ 91 (-20.18%)
Ik Analyzer
Supports Lucene 5/6/7/8+, with long-term maintenance.
Stars: ✭ 112 (-1.75%)
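Jeeplatform
An enterprise information system development platform that plans to integrate the common business functions of enterprise systems such as OA (office automation) and CMS (content management system). JeePlatform is a general-purpose base platform built around SpringBoot as its core framework, combining the ORM framework Mybatis, the web-layer framework SpringMVC, and various open-source component frameworks. The code has been donated to the OSChina open-source community.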
Stars: ✭ 1,285 (+1027.19%)
Sparktutorial
Source code for James Lee's Apache Spark with Java course
Stars: ✭ 105 (-7.89%)
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-22.81%)
Solrplugins
Dice Solr Plugins from Simon Hughes Dice.com
Stars: ✭ 86 (-24.56%)
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-24.56%)
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-26.32%)
Smartstore
Open Source ASP.NET Core Enterprise eCommerce Shopping Cart Solution
Stars: ✭ 82 (-28.07%)
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (-4.39%)
Lehar
Visualize data using relative ordering
Stars: ✭ 81 (-28.95%)
Archivespark
An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at Internet Archive.
Stars: ✭ 111 (-2.63%)
Straw
A platform for real-time streaming search
Stars: ✭ 98 (-14.04%)
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-30.7%)
Docker Spark
🚢 Docker image for Apache Spark
Stars: ✭ 78 (-31.58%)
Home
ApacheCN open-source organization: announcements, introduction, members, events, and contact channels
Stars: ✭ 1,199 (+951.75%)
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-5.26%)
Logisland
Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources, and sinks is available.
Stars: ✭ 97 (-14.91%)
Laravel Lucene Search
Laravel 4.2, 5.* package for full-text search over Eloquent models based on ZF2 Lucene.
Stars: ✭ 75 (-34.21%)
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schemas. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-14.91%)
Spring Shiro Spark
Spring-Shiro-Spark is an integration experiment combining Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs, ...
Stars: ✭ 114 (+0%)