Bdp Dataplatform大数据生态解决方案数据平台:基于大数据、数据平台、微服务、机器学习、商城、自动化运维、DevOps、容器部署平台、数据平台采集、数据平台存储、数据平台计算、数据平台开发、数据平台应用搭建的大数据解决方案。
Stars: ✭ 456 (+1069.23%)
Pulsar SparkWhen Apache Pulsar meets Apache Spark
Stars: ✭ 55 (+41.03%)
Repository个人学习知识库涉及到数据仓库建模、实时计算、大数据、Java、算法等。
Stars: ✭ 92 (+135.9%)
Kamu CliNext generation tool for decentralized exchange and transformation of semi-structured data
Stars: ✭ 69 (+76.92%)
Hops ExamplesExamples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (+115.38%)
WirbelsturmWirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (+751.28%)
FeatranA Scala feature transformation library for data science and machine learning
Stars: ✭ 420 (+976.92%)
StreamlineStreamLine - Streaming Analytics
Stars: ✭ 151 (+287.18%)
RegistrySchema Registry
Stars: ✭ 184 (+371.79%)
DataspherestudioDataSphereStudio is a one stop data application development& management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+2964.1%)
QuicksqlA Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+4569.23%)
CloudflowCloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (+612.82%)
Bigdata Interview🎯 🌟[大数据面试题]分享自己在网络上收集的大数据相关的面试题以及自己的答案总结.目前包含Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper框架的面试题知识总结
Stars: ✭ 857 (+2097.44%)
HadoopcryptoledgerHadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (+223.08%)
Sparkstreaming💥 🚀 封装sparkstreaming动态调节batch time(有数据就执行计算);🚀 支持运行过程中增删topic;🚀 封装sparkstreaming 1.6 - kafka 010 用以支持 SSL。
Stars: ✭ 179 (+358.97%)
Pulsar FlinkElastic data processing with Apache Pulsar and Apache Flink
Stars: ✭ 126 (+223.08%)
Big WhaleSpark、Flink等离线任务的调度以及实时任务的监控
Stars: ✭ 163 (+317.95%)
Flink Learningflink learning blog. http://www.54tianzhisheng.cn/ 含 Flink 入门、概念、原理、实战、性能调优、源码解析等内容。涉及 Flink Connector、Metrics、Library、DataStream API、Table API & SQL 等内容的学习案例,还有 Flink 落地应用的大型项目案例(PVUV、日志存储、百亿数据实时去重、监控告警)分享。欢迎大家支持我的专栏《大数据实时计算引擎 Flink 实战与性能优化》
Stars: ✭ 11,378 (+29074.36%)
WaterdropProduction Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+4658.97%)
God Of Bigdata专注大数据学习面试,大数据成神之路开启。Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+15305.13%)
ZeppelinWeb-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+14035.9%)
Bigdataguide大数据学习,从零开始学习大数据,包含大数据学习各阶段学习视频、面试资料
Stars: ✭ 817 (+1994.87%)
fastdata-clusterFast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-48.72%)
Kafka Storm StarterCode examples that show to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+1766.67%)
MlfeatureFeature engineering toolkit for Spark MLlib.
Stars: ✭ 12 (-69.23%)
SparkmagicJupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+2346.15%)
Cdev ServerDevelopment REST API for InterSystems Caché 2014.1+
Stars: ✭ 11 (-71.79%)
MareMaRe leverages the power of Docker and Spark to run and scale your serial tools in MapReduce fashion.
Stars: ✭ 11 (-71.79%)
GridA Lightning Component grid implementation that expects a server-side data store.
Stars: ✭ 35 (-10.26%)
PucketBucketing and partitioning system for Parquet
Stars: ✭ 29 (-25.64%)
SparkjniA heterogeneous Apache Spark framework.
Stars: ✭ 11 (-71.79%)
Data Algorithms Book MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+2333.33%)
Hazelcast JetDistributed Stream and Batch Processing
Stars: ✭ 855 (+2092.31%)
Vagrant ProjectsVagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR
Stars: ✭ 34 (-12.82%)
Storm Camel ExampleReal-time analysis and visualization with Storm-AMQ-Camel-Websockets-Highcharts integration.
Stars: ✭ 28 (-28.21%)
Dockerfiles50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+2071.79%)
Sfdc Debug LogsBrowser extension for Salesforce logs management
Stars: ✭ 28 (-28.21%)
Apex ChainableChain Batches in a readable and flexible way without hardcoding the successor.
Stars: ✭ 27 (-30.77%)
Cogstack PipelineDistributed, fault tolerant batch processing for Natural Language Applications and Search, using remote partitioning
Stars: ✭ 26 (-33.33%)
Sendgrid ApexSendGrid (http://sendgrid.com) Apex helper library.
Stars: ✭ 33 (-15.38%)
TweetmapA real time Tweet Trend Map and Sentiment Analysis web application with kafka, Angular, Spring Boot, Flink, Elasticsearch, Kibana, Docker and Kubernetes deployed on the cloud
Stars: ✭ 28 (-28.21%)
Tiledb VcfEfficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-33.33%)
Stormtweetssentimentd3vizComputes and visualizes the sentiment analysis of tweets of US States in real-time using Storm.
Stars: ✭ 25 (-35.9%)
HeraclesHigh performance HBase / Spark SQL engine
Stars: ✭ 27 (-30.77%)
Spark SwaggerSpark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-35.9%)
Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+2428.21%)