Spark R Notebooks: R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-95.89%)
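Blog demos: GitHub repository of CSDN blog expert and programmer Xinchen (欣宸), with a detailed categorization and summary of more than four hundred original articles plus the corresponding source code, covering Java, Docker, Kubernetes, DevOps, and more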
Stars: ✭ 1,030 (-61.15%)
Streamline: StreamLine - Streaming Analytics
Stars: ✭ 151 (-94.3%)
Reddit sse stream: A Server Side Event stream to deliver Reddit comments and submissions in near real-time to a client.
Stars: ✭ 39 (-98.53%)
Awesome Bigdata: A curated list of awesome big data frameworks, resources and other awesomeness.
Stars: ✭ 10,478 (+295.25%)
Autocrawler: Google, Naver multiprocess image web crawler (Selenium)
Stars: ✭ 957 (-63.9%)
Twitwork: Monitor Twitter stream
Stars: ✭ 133 (-94.98%)
Aws Auto Terminate Idle Emr: The AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and AWS Lambda, with a Python script built on Boto3, to terminate AWS EMR clusters that have been idle for a specified period of time.
Stars: ✭ 21 (-99.21%)
Tennis Crystal Ball: Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-95.96%)
Java Notes: ☕️ Java basics 👫 Object-oriented thinking ✏️ Algorithms 📝 Operating systems ☁️ Networking 💾 Databases 🙊 Spring 💡 System architecture 🐘 Big data
Stars: ✭ 160 (-93.96%)
Liteflow: liteflow is a distributed task-flow scheduling system implemented on the basis of task versions
Stars: ✭ 112 (-95.78%)
Athena Cli: Presto-like CLI tool for AWS Athena
Stars: ✭ 85 (-96.79%)
Mobius: C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (-64.96%)
Spark: .NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (-35.08%)
Splash: Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (-96.04%)
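Coding Now: Study notes, together with e-books and video resources the author has gone through and the blogs, websites, and tools the author has collected and found useful. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more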
Stars: ✭ 750 (-71.71%)
Athenacli: AthenaCLI is a CLI tool for AWS Athena service that can do auto-completion and syntax highlighting.
Stars: ✭ 151 (-94.3%)
Spark Movie Lens: An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (-71.9%)
Volcano: A Cloud Native Batch System (Project under CNCF)
Stars: ✭ 2,114 (-20.26%)
Spark Py Notebooks: Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-49.53%)
Jigsaw: Jigsaw (七巧板, "tangram") provides a set of web components based on Angular5/8/9+. Its main purpose is to help application developers build complex, interaction-intensive, user-friendly web pages. Jigsaw supports the development of all applications of ZTE's Big Data Product.
Stars: ✭ 354 (-86.65%)
Cds: Data syncing in golang for ClickHouse.
Stars: ✭ 501 (-81.1%)
Eagle: Real-time data processing system based on Flink and CEP
Stars: ✭ 95 (-96.42%)
Yauaa: Yet Another UserAgent Analyzer
Stars: ✭ 472 (-82.2%)
Pulsar Flink: Elastic data processing with Apache Pulsar and Apache Flink
Stars: ✭ 126 (-95.25%)
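Bdp Dataplatform: A data platform for big data ecosystem solutions: a big data solution built around big data, data platforms, microservices, machine learning, e-commerce, automated operations, DevOps, a container deployment platform, and data platform collection, storage, computing, development, and application building.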
Stars: ✭ 456 (-82.8%)
Mnemonic: Apache Mnemonic - A non-volatile hybrid memory storage oriented library
Stars: ✭ 91 (-96.57%)
Poli: An easy-to-use BI server built for SQL lovers. Power data analysis in SQL and gain faster business insights.
Stars: ✭ 1,850 (-30.22%)
Api.rss: RSS as RESTful. This service allows you to transform an RSS feed into an awesome API.
Stars: ✭ 340 (-87.17%)
Circosjs: d3 library to build circular graphs
Stars: ✭ 436 (-83.55%)
Ignite Book Code Samples: All code samples, scripts and more in-depth examples for the book "High Performance In-Memory Computing with Apache Ignite". Please use the repository "the-apache-ignite-book" for Ignite version 2.6 or above.
Stars: ✭ 86 (-96.76%)
Featran: A Scala feature transformation library for data science and machine learning
Stars: ✭ 420 (-84.16%)
Flink Docker: Docker packaging for Apache Flink
Stars: ✭ 118 (-95.55%)
Mlsql: The Programming Language Designed For Big Data and AI
Stars: ✭ 1,262 (-52.4%)
Sidekick: High Performance HTTP Sidecar Load Balancer
Stars: ✭ 366 (-86.19%)
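Javainterview: The most complete collection of Java technical knowledge points, together with Java source code analysis. A personal contribution to open source.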
Stars: ✭ 154 (-94.19%)
Sylph: Stream computing platform for bigdata
Stars: ✭ 362 (-86.34%)
Hops Examples: Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-96.83%)
Datawave: DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.
Stars: ✭ 347 (-86.91%)
Genie: Distributed Big Data Orchestration Service
Stars: ✭ 1,544 (-41.76%)
Datafaker: Datafaker is a large-scale test data and flow test data generation tool. It generates fake data and inserts it into various data sources.
Stars: ✭ 327 (-87.67%)
Uproot4: ROOT I/O in pure Python and NumPy.
Stars: ✭ 80 (-96.98%)
Uproot3: ROOT I/O in pure Python and NumPy.
Stars: ✭ 312 (-88.23%)
Spline: Data Lineage Tracking And Visualization Solution
Stars: ✭ 306 (-88.46%)
Azure Event Hubs Spark: Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-94.72%)
Lambda Arch: Applying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-95.81%)
Cleanframes: Type-class based data cleansing library for Apache Spark SQL
Stars: ✭ 75 (-97.17%)
Flink: Apache Flink is an open source project of The Apache Software Foundation (ASF). The Apache Flink project originated from the Stratosphere research project.
Stars: ✭ 17,781 (+570.73%)
Cloudflow: Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (-89.51%)
Waterdrop: Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (-29.99%)