Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+2788.64%)
Freestyle
A cohesive & pragmatic framework of FP centric Scala libraries
Stars: ✭ 627 (+375%)
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+805.3%)
Airflow Pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
Stars: ✭ 128 (-3.03%)
Nagios Plugins
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (+657.58%)
Wirbelsturm
Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (+151.52%)
Abris
Avro SerDe for Apache Spark structured APIs.
Stars: ✭ 130 (-1.52%)
Docker Spark
🚢 Docker image for Apache Spark
Stars: ✭ 78 (-40.91%)
Hadoop cookbook
Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-37.88%)
Logisland
Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready-to-use processors, data sources and sinks is available.
Stars: ✭ 97 (-26.52%)
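Technology Talk
A collection of knowledge on the Java ecosystem: commonly used frameworks, open-source middleware, system architecture, databases, architecture case studies from large companies, common third-party libraries, project management, production troubleshooting, personal growth, and reflections.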
Stars: ✭ 12,136 (+9093.94%)
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (+6.06%)
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1306.06%)
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (+6.06%)
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+62.88%)
spark-util
low-level helpers for Apache Spark libraries and tests
Stars: ✭ 16 (-87.88%)
swordfish
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-73.48%)
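leaflet heatmap
A simple visualization of Huzhou call data. Assuming the data volume is too large for the browser to draw the heatmap directly, the heatmap rendering step is moved to offline computation. After the data is processed in parallel with Apache Spark, the heatmap is also rendered with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer to give good interactivity. With the current Apache Spark implementation, the parallel computation is slower than the single-machine version, either because Apache Spark is not well suited to this kind of computation or because I did not design the algorithm well. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .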
Stars: ✭ 13 (-90.15%)
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-84.85%)
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-81.06%)
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-89.39%)
Gather Deployment
Gathers scalable TensorFlow and infrastructure deployment
Stars: ✭ 326 (+146.97%)
Data Algorithms Book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+618.94%)
Camus
Mirror of LinkedIn's Camus
Stars: ✭ 81 (-38.64%)
Asakusafw
Asakusa Framework
Stars: ✭ 114 (-13.64%)
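Apiproject
[https://www.sofineday.com], a Go project scaffold that integrates best practices (gin + gorm + go-redis + mongo + cors + jwt + the JSON logging library zap (with log collection to kafka or mongo) + kafka message queue + WeChat Pay and Alipay payments via gopay + API encryption + API reverse proxy + go modules dependency management + headless crawler chromedp + makefile + binary compression + livereload hot reloading)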
Stars: ✭ 124 (-6.06%)
Spark Lucenerdd
Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-13.64%)
Spark Infotheoretic Feature Selection
This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.
Stars: ✭ 123 (-6.82%)
Spring Shiro Spark
Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs, and more
Stars: ✭ 114 (-13.64%)
Mmo Server
Distributed Java game server, including login, gateway, game demo
Stars: ✭ 114 (-13.64%)
Dynamometer
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Stars: ✭ 122 (-7.58%)
Parquet Go
Go package to read and write Parquet files. Parquet is a file format to store nested data structures in a flat columnar data format. It can be used in the Hadoop ecosystem and with tools such as Presto and AWS Athena.
Stars: ✭ 114 (-13.64%)
Spydra
Ephemeral Hadoop clusters using Google Compute Platform
Stars: ✭ 128 (-3.03%)
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-7.58%)
Python Bigdata
Data science and Big Data with Python
Stars: ✭ 112 (-15.15%)
Tunnel
PG (PostgreSQL) data synchronization tool (implemented in Java)
Stars: ✭ 122 (-7.58%)
Ultimate Go
This repo contains my notes on working with Go and computer systems.
Stars: ✭ 1,530 (+1059.09%)
Kkbinlog
Supports subscribing to and distributing data changes from MySQL and MongoDB
Stars: ✭ 112 (-15.15%)
Griffon Vm
Griffon Data Science Virtual Machine
Stars: ✭ 128 (-3.03%)
Php Rdkafka
Production-ready, stable Kafka client for PHP
Stars: ✭ 1,703 (+1190.15%)
Archivespark
An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
Stars: ✭ 111 (-15.91%)
Elephas
Distributed Deep learning with Keras & Spark
Stars: ✭ 1,521 (+1052.27%)
Slimmessagebus
Lightweight message bus interface for .NET (pub/sub and request-response) with transport plugins for popular message brokers.
Stars: ✭ 120 (-9.09%)
Myth
Reliable messages resolve distributed transactions
Stars: ✭ 1,470 (+1013.64%)
Lambda Arch
Applying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-15.91%)
Metronome
Metronome is a distributed and fault-tolerant event scheduler
Stars: ✭ 131 (-0.76%)
Opaque
An encrypted data analytics platform
Stars: ✭ 129 (-2.27%)
Openuba
A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-3.79%)
Kafka Zk Restapi
Kafka Zookeeper RESTful API to perform topic/consumer group administration and metric (offset/lag/message) collection and monitoring
Stars: ✭ 121 (-8.33%)