A tool for provisioning and managing Apache Hadoop clusters in the cloud. Cloudbreak, as part of the Hortonworks Data Platform, makes it easy to provision, configure and elastically grow HDP clusters on cloud infrastructure. Cloudbreak can be used to provision Hadoop across cloud infrastructure providers including AWS, Azure, GCP and OpenStack.

Stars: ✭ 301 (+133.33%)

Mutual labels: big-data

Couchdb Docker

Semi-official Apache CouchDB Docker images

Stars: ✭ 194 (+50.39%)

Mutual labels: big-data

Ambari

Mirror of Apache Ambari

Stars: ✭ 1,576 (+1121.71%)

Mutual labels: big-data

Attic Predictionio Sdk Ruby

PredictionIO Ruby SDK

Stars: ✭ 192 (+48.84%)

Mutual labels: big-data

Baize

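Baize automated operations and maintenance system: configuration management, network probing, asset management, business management, CMDB, CD, DevOps, job orchestration, task orchestration, and other features. Monitoring, alerting, log analysis, and big data analysis are planned for the future.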

Stars: ✭ 296 (+129.46%)

Mutual labels: big-data

Presto Go Client

A Presto client for the Go programming language.

Stars: ✭ 183 (+41.86%)

Mutual labels: big-data

Skymap

High-throughput gene to knowledge mapping through massive integration of public sequencing data.

Stars: ✭ 29 (-77.52%)

Mutual labels: big-data

Bigdata Playground

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL

Stars: ✭ 177 (+37.21%)

Mutual labels: big-data

Crate

CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.

Stars: ✭ 3,254 (+2422.48%)

Mutual labels: big-data

Keyvi

Keyvi - a key-value index that powers the Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and lookup performance.

Stars: ✭ 171 (+32.56%)

Mutual labels: big-data

Parquet Mr

Apache Parquet

Stars: ✭ 1,278 (+890.7%)

Mutual labels: big-data

Geopyspark

GeoTrellis for PySpark

Stars: ✭ 167 (+29.46%)

Mutual labels: big-data

Oie Resources

A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.

Stars: ✭ 283 (+119.38%)

Mutual labels: big-data

Fluo

Apache Fluo

Stars: ✭ 159 (+23.26%)

Mutual labels: big-data

Awesome Scalability

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

Stars: ✭ 36,688 (+28340.31%)

Mutual labels: big-data

Geni

A Clojure dataframe library that runs on Spark

Stars: ✭ 152 (+17.83%)

Mutual labels: big-data

Knowage Server

Knowage is the professional open source suite for modern business analytics over traditional sources and big data systems.

Stars: ✭ 276 (+113.95%)

Mutual labels: big-data

Datasciencevm

Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)

Stars: ✭ 153 (+18.6%)

Mutual labels: big-data

Report

Automated, configurable reporting platform. Demo: http://58.87.112.247/report (account: visitor, password: 123456)

Stars: ✭ 123 (-4.65%)

Mutual labels: big-data

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (+16.28%)

Mutual labels: big-data

Attic Predictionio Sdk Php

PredictionIO PHP SDK

Stars: ✭ 272 (+110.85%)

Mutual labels: big-data

100daysofmlcode

My journey to learn and grow in the domain of Machine Learning and Artificial Intelligence by performing the #100DaysofMLCode Challenge.

Stars: ✭ 146 (+13.18%)

Mutual labels: big-data

K8s Ingress Claim

An admission control policy that safeguards against accidental duplicate claiming of Hosts/Domains.

Stars: ✭ 14 (-89.15%)

Mutual labels: big-data

Metamodel

Mirror of Apache Metamodel

Stars: ✭ 143 (+10.85%)

Mutual labels: big-data

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (+99.22%)

Mutual labels: big-data

Big Data Study

🐳 big data study

Stars: ✭ 141 (+9.3%)

Mutual labels: big-data

Panoptes

A Global Scale Network Telemetry Ecosystem

Stars: ✭ 80 (-37.98%)

Mutual labels: big-data

bandar-log

Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.

Stars: ✭ 20 (-84.5%)

Mutual labels: big-data

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Stars: ✭ 1,642 (+1172.87%)

Mutual labels: big-data

Tajo

Mirror of Apache Tajo

Stars: ✭ 128 (-0.78%)

Mutual labels: big-data

Richdem

High-performance Terrain and Hydrology Analysis

Stars: ✭ 127 (-1.55%)

Mutual labels: big-data

Cmak

CMAK is a tool for managing Apache Kafka clusters

Stars: ✭ 10,544 (+8073.64%)

Mutual labels: big-data

Bigdata Notes

A beginner's guide to big data ⭐

Stars: ✭ 10,991 (+8420.16%)

Mutual labels: big-data

Warp

Convert and analyze large data sets at light speed, on Mac and iOS.

Stars: ✭ 62 (-51.94%)

Mutual labels: big-data

Fit Sne

Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)

Stars: ✭ 485 (+275.97%)

Mutual labels: big-data

nebula

A distributed, fast open-source graph database featuring horizontal scalability and high availability

Stars: ✭ 8,196 (+6253.49%)

Mutual labels: big-data

301-360 of 369 similar projects
