651 open-source projects that are alternatives to or similar to Big Whale

Spark.fish
▁▂▄▆▇█▇▆▄▂▁
Stars: ✭ 229 (+40.49%)
Mutual labels:  spark
Spark Workshop
Apache Spark™ and Scala Workshops
Stars: ✭ 224 (+37.42%)
Mutual labels:  spark
Ruby Spark
Ruby wrapper for Apache Spark
Stars: ✭ 221 (+35.58%)
Mutual labels:  spark
Sagemaker Spark
A Spark library for Amazon SageMaker.
Stars: ✭ 219 (+34.36%)
Mutual labels:  spark
Spark Excel
A Spark plugin for reading Excel files via Apache POI
Stars: ✭ 216 (+32.52%)
Mutual labels:  spark
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+32.52%)
Mutual labels:  spark
Hydro Serving
MLOps Platform
Stars: ✭ 213 (+30.67%)
Mutual labels:  spark
Example Spark
Spark, Spark Streaming and Spark SQL unit testing strategies
Stars: ✭ 205 (+25.77%)
Mutual labels:  spark
Spark Knn
k-Nearest Neighbors algorithm on Spark
Stars: ✭ 205 (+25.77%)
Mutual labels:  spark
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+1678.53%)
Mutual labels:  spark
Spark Practice
Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+22.7%)
Mutual labels:  spark
Ballista
Distributed compute platform implemented in Rust, and powered by Apache Arrow.
Stars: ✭ 2,274 (+1295.09%)
Mutual labels:  spark
Scanns
A scalable nearest neighbor search library in Apache Spark
Stars: ✭ 190 (+16.56%)
Mutual labels:  spark
Js Spark
Realtime calculation distributed system. AKA distributed lodash
Stars: ✭ 187 (+14.72%)
Mutual labels:  spark
Azuredatabricksbestpractices
Version 1 of Technical Best Practices of Azure Databricks based on real world Customer and Technical SME inputs
Stars: ✭ 186 (+14.11%)
Mutual labels:  spark
Kotlin Spark Api
This project gives Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x
Stars: ✭ 183 (+12.27%)
Mutual labels:  spark
Roaringbitmap
A better compressed bitset in Java
Stars: ✭ 2,460 (+1409.2%)
Mutual labels:  spark
Spark Streaming With Kafka
Self-contained examples of Apache Spark streaming integrated with Apache Kafka.
Stars: ✭ 180 (+10.43%)
Mutual labels:  spark
Xsql
Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (+7.98%)
Mutual labels:  spark
Spark Kafka Writer
Write your Spark data to Kafka seamlessly
Stars: ✭ 175 (+7.36%)
Mutual labels:  spark
Kraps Rpc
An RPC framework leveraging Spark RPC module
Stars: ✭ 175 (+7.36%)
Mutual labels:  spark
Spark
Firely's open source FHIR server
Stars: ✭ 174 (+6.75%)
Mutual labels:  spark
Spark Nlp
State of the Art Natural Language Processing
Stars: ✭ 2,518 (+1444.79%)
Mutual labels:  spark
Transmogrifai
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Stars: ✭ 2,084 (+1178.53%)
Mutual labels:  spark
Spark Structured Streaming Examples
Spark Structured Streaming / Kafka / Cassandra / Elastic
Stars: ✭ 168 (+3.07%)
Mutual labels:  spark
Spark Iforest
Isolation Forest on Spark
Stars: ✭ 166 (+1.84%)
Mutual labels:  spark
Geopyspark
GeoTrellis for PySpark
Stars: ✭ 167 (+2.45%)
Mutual labels:  spark
Azure Cosmosdb Spark
Apache Spark Connector for Azure Cosmos DB
Stars: ✭ 165 (+1.23%)
Mutual labels:  spark
Devops Bash Tools
550+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Kafka, Docker, APIs, Hadoop, SQL, PostgreSQL, MySQL, Hive, Impala, Travis CI, Jenkins, Concourse, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, .tmux.conf, .psqlrc ...
Stars: ✭ 226 (+38.65%)
Mutual labels:  hadoop
Hadoop Attack Library
A collection of pentest tools and resources targeting Hadoop environments
Stars: ✭ 228 (+39.88%)
Mutual labels:  hadoop
Luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Stars: ✭ 15,226 (+9241.1%)
Mutual labels:  hadoop
Hadoop Connectors
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
Stars: ✭ 218 (+33.74%)
Mutual labels:  hadoop
Calcite
Apache Calcite
Stars: ✭ 2,816 (+1627.61%)
Mutual labels:  hadoop
Facebook Hive Udfs
Facebook's Hive UDFs
Stars: ✭ 213 (+30.67%)
Mutual labels:  hadoop
Shifu
An end-to-end machine learning and data mining framework on Hadoop
Stars: ✭ 207 (+26.99%)
Mutual labels:  hadoop
Recommendsys
Recommendation project (real-time and offline recommendation)
Stars: ✭ 198 (+21.47%)
Mutual labels:  hadoop
Awesome Learning
Hands-on source code repository: https://github.com/jast90/bigdata . Search for Jast on WeChat and follow the official account for the latest technical posts 😯.
Stars: ✭ 197 (+20.86%)
Mutual labels:  hadoop
Nutch
Apache Nutch is an extensible and scalable web crawler
Stars: ✭ 2,277 (+1296.93%)
Mutual labels:  hadoop
Hive Jdbc Uber Jar
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Stars: ✭ 188 (+15.34%)
Mutual labels:  hadoop
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+8.59%)
Mutual labels:  hadoop
Flink Doc Zh
Apache Flink Chinese documentation
Stars: ✭ 242 (+48.47%)
Mutual labels:  flink
Flink Boot
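The Lazy Squirrel Flink-Boot scaffold lets Flink fully embrace the Spring ecosystem, so developers can write distributed stream-processing programs using the Java web development model; Lazy Squirrel makes crossing domains simpler. It aims to let developers write business code quickly at a lower onboarding cost, without needing to understand distributed computing theory or the details of the Flink framework. To further improve agility when building large projects with the scaffold, it integrates the Spring framework for bean management by default and also bundles frameworks commonly used in microservice and web development, such as the MyBatis ORM framework, the Hibernate Validator validation framework, and the Spring Retry framework; see the scaffold features below for details.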
Stars: ✭ 209 (+28.22%)
Mutual labels:  flink
Flink Recommandsystem Demo
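🚁🚀 A real-time product recommendation system built on Flink. Flink computes product popularity and stores it in a Redis cache, analyzes log data, and writes profile tags and real-time records to HBase. When a user requests recommendations, the popularity ranking is re-sorted according to the user's profile, and two recommendation modules (collaborative filtering and tag-based) attach related products to each product in the newly generated list, which is then returned to the user.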
Stars: ✭ 3,115 (+1811.04%)
Mutual labels:  flink
Flink Sql Cookbook
The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are completely self-contained and can be run in Ververica Platform as is.
Stars: ✭ 189 (+15.95%)
Mutual labels:  flink
Flink Spector
Framework for Apache Flink unit tests
Stars: ✭ 190 (+16.56%)
Mutual labels:  flink
Registry
Schema Registry
Stars: ✭ 184 (+12.88%)
Mutual labels:  flink
Nussknacker
Process authoring tool for Apache Flink
Stars: ✭ 182 (+11.66%)
Mutual labels:  flink
Flink Commodity Recommendation System
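🐳 A real-time product recommendation system based on Flink. Redis caches hot data. When a user rates a product, the data is sent through Kafka to Flink, which produces real-time and offline recommendations from the user's rating history. Real-time recommendations include behavior-based and real-time trending; offline recommendations include historical trending, historical top-quality products, and itemcf.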
Stars: ✭ 167 (+2.45%)
Mutual labels:  flink
Flinkx
Based on Apache Flink. Supports data synchronization/integration and streaming SQL computation.
Stars: ✭ 2,651 (+1526.38%)
Mutual labels:  flink
Flink Clickhouse Sink
Flink sink for Clickhouse
Stars: ✭ 165 (+1.23%)
Mutual labels:  flink
Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
Stars: ✭ 2,936 (+1701.23%)
Mutual labels:  flink