IotdbApache IoTDB
Stars: ✭ 1,221 (-24.58%)
MorpheusMorpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Stars: ✭ 303 (-81.28%)
Esteem SurferEcency desktop formerly known as Esteem Surfer - reimagined desktop social wallet, contribute and get rewarded (for Windows, Mac, Linux)
Stars: ✭ 100 (-93.82%)
ElasticlusterCreate clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (-81.59%)
Data Algorithms Book MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (-41.38%)
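BaizeBaize automated operations and maintenance system: configuration management, network probing, asset management, business management, CMDB, CD, DevOps, job orchestration, task orchestration, and more. Monitoring, alerting, log analysis, and big data analysis are planned for the future.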
Stars: ✭ 296 (-81.72%)
QcportalA client interface to the QCArchive Project (read-only image of QCFractal)
Stars: ✭ 29 (-98.21%)
CmakCMAK is a tool for managing Apache Kafka clusters
Stars: ✭ 10,544 (+551.27%)
BehemothBehemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Stars: ✭ 286 (-82.33%)
Awesome ScalabilityThe Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Stars: ✭ 36,688 (+2166.09%)
PorsasExperimental stuff for going fast with Clojure + JDBC & Async SQL
Stars: ✭ 78 (-95.18%)
ZeppelinWeb-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+240.52%)
Cloud VolumeRead and write Neuroglancer datasets programmatically.
Stars: ✭ 63 (-96.11%)
Hibernate SpringbootCollection of best practices for Java persistence performance in Spring Boot applications
Stars: ✭ 589 (-63.62%)
RatatoolA tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (-82.77%)
Springboot TemplatesIntegration of springboot with dubbo and netty; NoSQL templates for redis and mongodb; MQ templates for kafka, rocketmq, and rabbit; query engines solr, solrcloud, and elasticsearch
Stars: ✭ 100 (-93.82%)
DatahubThe Metadata Platform for the Modern Data Stack
Stars: ✭ 4,232 (+161.4%)
Tf YarnTrain TensorFlow models on YARN in just a few lines of code!
Stars: ✭ 76 (-95.31%)
RoapiCreate full-fledged APIs for static datasets without writing a single line of code.
Stars: ✭ 253 (-84.37%)
Dremio OssDremio - the missing link in modern data
Stars: ✭ 862 (-46.76%)
BigdataclassTwo-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (-93.21%)
Tennis Crystal BallUltimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-93.39%)
JailerDatabase Subsetting and Relational Data Browsing Tool.
Stars: ✭ 576 (-64.42%)
AlluxioAlluxio, data orchestration for analytics and machine learning in the cloud
Stars: ✭ 5,379 (+232.24%)
H2databaseH2 is an embeddable RDBMS written in Java.
Stars: ✭ 3,078 (+90.12%)
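Bigdata Interview🎯 🌟[Big data interview questions] Big data interview questions I collected online, along with summaries of my own answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.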
Stars: ✭ 857 (-47.07%)
bigstatsrR package for statistical tools with big matrices stored on disk.
Stars: ✭ 139 (-91.41%)
CookbookThe Data Engineering Cookbook
Stars: ✭ 9,829 (+507.1%)
masonREST APIs with JSP tags, SQL and much more.
Stars: ✭ 24 (-98.52%)
Hazelcast JetDistributed Stream and Batch Processing
Stars: ✭ 855 (-47.19%)
pulsephData Pulse application log aggregation and monitoring
Stars: ✭ 13 (-99.2%)
AutodlAutomated Deep Learning without ANY human intervention. 1st Solution for AutoDL [email protected]
Stars: ✭ 854 (-47.25%)
hadoop-docker-liteDocker build project to setup a lightweight hadoop cluster containing hadoop, pig, zookeeper, hbase, phoenix, storm, kafka, kafka manager
Stars: ✭ 24 (-98.52%)
StackMob-3A plugin designed for bukkit servers, aiming to reduce the lag that both the server and players experience.
Stars: ✭ 23 (-98.58%)
Hadoop PotA scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.
Stars: ✭ 8 (-99.51%)
Just Dashboard📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (-6.67%)
avro-schema-generatorLibrary for generating avro schema files (.avsc) based on DB tables structure
Stars: ✭ 38 (-97.65%)
DatabookA facebook for data
Stars: ✭ 26 (-98.39%)
dbddbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (-98.15%)
LabsResearch on distributed systems
Stars: ✭ 73 (-95.49%)
PgjdbcPostgreSQL JDBC Driver
Stars: ✭ 925 (-42.87%)
GiraphMirror of Apache Giraph
Stars: ✭ 569 (-64.85%)
WarpConvert and analyze large data sets at light speed, on Mac and iOS.
Stars: ✭ 62 (-96.17%)
ScannerEfficient video analysis at scale
Stars: ✭ 569 (-64.85%)
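Hadoop studyRegularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos (the project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code; continuously updated!)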
Stars: ✭ 567 (-64.98%)
NabhashAn extremely fast Non-crypto-safe AES Based Hash algorithm for Big Data
Stars: ✭ 62 (-96.17%)
PachydermReproducible Data Science at Scale!
Stars: ✭ 5,305 (+227.67%)
NipypeWorkflows and interfaces for neuroimaging packages
Stars: ✭ 557 (-65.6%)
WaimakWaimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Stars: ✭ 60 (-96.29%)
CouchdbSeamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
Stars: ✭ 5,166 (+219.09%)
ThrillThrill - An EXPERIMENTAL Algorithmic Distributed Big Data Batch Processing Framework in C++
Stars: ✭ 528 (-67.39%)
Pythondatarepo for code published on pythondata.com
Stars: ✭ 113 (-93.02%)