SparkrdmaRDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-11.89%)
RoaringbitmapA better compressed bitset in Java
Stars: ✭ 2,460 (+908.2%)
AztkAZTK powered by Azure Batch: On-demand, Dockerized, Spark Jobs on Azure
Stars: ✭ 152 (-37.7%)
Cc PysparkProcess Common Crawl data with Python and Spark
Stars: ✭ 147 (-39.75%)
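Sparkstreaming💥 🚀 Wraps Spark Streaming to adjust batch time dynamically (computation runs whenever data arrives); 🚀 supports adding and removing topics at runtime; 🚀 wraps Spark Streaming 1.6 with Kafka 0.10 to support SSL.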
Stars: ✭ 179 (-26.64%)
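Technology TalkA collection of knowledge on the Java ecosystem: commonly used technology frameworks, open source middleware, system architecture, databases, architecture case studies from large companies, common third-party libraries, project management, production troubleshooting, personal growth, and reflections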
Stars: ✭ 12,136 (+4873.77%)
Azure Event Hubs☁️ Cloud-scale telemetry ingestion from any stream of data with Azure Event Hubs
Stars: ✭ 233 (-4.51%)
RsparseFast and accurate machine learning on sparse matrices - matrix factorizations, regression, classification, top-N recommendations.
Stars: ✭ 145 (-40.57%)
SparkFirely's open source FHIR server
Stars: ✭ 174 (-28.69%)
Spark AuthorizerA Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (-42.21%)
Spark Knnk-Nearest Neighbors algorithm on Spark
Stars: ✭ 205 (-15.98%)
Data science blogsA repository to keep track of all the code that I end up writing for my blog posts.
Stars: ✭ 139 (-43.03%)
Deeplearning4jSuite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learni…
Stars: ✭ 12,277 (+4931.56%)
RecotourA tour through recommendation algorithms in python [IN PROGRESS]
Stars: ✭ 140 (-42.62%)
Sparkling GraphSparklingGraph provides an easy-to-use set of features that lets you process large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-43.03%)
QuicksqlA Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+646.31%)
Apache Spark NodeNode.js bindings for Apache Spark DataFrame APIs
Stars: ✭ 136 (-44.26%)
RecommenderA C library for product recommendations/suggestions using collaborative filtering (CF)
Stars: ✭ 238 (-2.46%)
DeepconnThis is our implementation of DeepCoNN
Stars: ✭ 131 (-46.31%)
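Flink Commodity Recommendation System🐳 A real-time product recommendation system based on Flink. Redis caches hot data. When a user submits a rating, the data is sent through Kafka to Flink, which produces real-time and offline recommendations based on the user's rating history. Real-time recommendations include behavior-based and real-time trending; offline recommendations include historical trending, historically high-quality products, and itemcf.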
Stars: ✭ 167 (-31.56%)
AbrisAvro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (-46.72%)
MmlsparkSimple and Distributed Machine Learning
Stars: ✭ 2,899 (+1088.11%)
Spylon KernelJupyter kernel for scala and spark
Stars: ✭ 129 (-47.13%)
GafferA large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+572.95%)
Sagemaker SparkA Spark library for Amazon SageMaker.
Stars: ✭ 219 (-10.25%)
FeastFeature Store for Machine Learning
Stars: ✭ 2,576 (+955.74%)
Big WhaleScheduling of offline jobs (Spark, Flink, etc.) and monitoring of real-time jobs
Stars: ✭ 163 (-33.2%)
OpenubaA robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-47.95%)
Spark PracticeApache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (-18.03%)
Cape PythonCollaborate on privacy-preserving policy for data science projects in Pandas and Apache Spark
Stars: ✭ 125 (-48.77%)
HadoopcryptoledgerHadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-48.36%)
Spark ExcelA Spark plugin for reading Excel files via Apache POI
Stars: ✭ 216 (-11.48%)
BallistaDistributed compute platform implemented in Rust, and powered by Apache Arrow.
Stars: ✭ 2,274 (+831.97%)
LinkisLinkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+852.05%)
Spark Bigquery ConnectorBigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Stars: ✭ 126 (-48.36%)
Scala SamplesPieces of Scala code that explain Scala syntax and related topics, such as what you can do with it
Stars: ✭ 125 (-48.77%)
Vue Info CardSimple and beautiful card component with an elegant spark line, for VueJS.
Stars: ✭ 159 (-34.84%)
DeeprecommenderDeep learning for recommender systems
Stars: ✭ 1,593 (+552.87%)
Spark Infotheoretic Feature SelectionThis package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.
Stars: ✭ 123 (-49.59%)
Django RecommendsA django app that builds item-based suggestions for users.
Stars: ✭ 194 (-20.49%)
GlowAn open-source toolkit for large-scale genomic analysis
Stars: ✭ 159 (-34.84%)
Spark AlchemyCollection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-50%)
DeequDeequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Stars: ✭ 2,020 (+727.87%)
ZparkioBoilerplate framework to use Spark and ZIO together.
Stars: ✭ 121 (-50.41%)