369 open source projects that are alternatives to or similar to Bigtop

Cloudbreak
A tool for provisioning and managing Apache Hadoop clusters in the cloud. Cloudbreak, as part of the Hortonworks Data Platform, makes it easy to provision, configure and elastically grow HDP clusters on cloud infrastructure. Cloudbreak can be used to provision Hadoop across cloud infrastructure providers including AWS, Azure, GCP and OpenStack.
Stars: ✭ 301 (-15.45%)
Mutual labels:  big-data
check-engine
Data validation library for PySpark 3.0.0
Stars: ✭ 29 (-91.85%)
Mutual labels:  big-data
pipeline
OONI data processing pipeline
Stars: ✭ 36 (-89.89%)
Mutual labels:  big-data
classifai
🔥 One of the most comprehensive open-source data annotation platforms.
Stars: ✭ 99 (-72.19%)
Mutual labels:  big-data
Beeva Best Practices
Best Practices and Style Guides in BEEVA
Stars: ✭ 335 (-5.9%)
Mutual labels:  big-data
storm-ml
an online learning algorithm library for Storm
Stars: ✭ 18 (-94.94%)
Mutual labels:  big-data
NiFi-Rule-engine-processor
Drools processor for Apache NiFi
Stars: ✭ 34 (-90.45%)
Mutual labels:  big-data
OnlineStatsBase.jl
Base types for OnlineStats.
Stars: ✭ 26 (-92.7%)
Mutual labels:  big-data
Baize
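Baize automated operations and maintenance system: configuration management, network probing, asset management, business management, CMDB, CD, DevOps, job orchestration, task orchestration, and other features. Monitoring, alerting, log analysis, big data analysis, and related components will be added in the future.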
Stars: ✭ 296 (-16.85%)
Mutual labels:  big-data
ByteSlice
"Byteslice: Pushing the envelop of main memory data processing with a new storage layout" (SIGMOD'15)
Stars: ✭ 24 (-93.26%)
Mutual labels:  big-data
leaflet heatmap
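A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap-rendering step is moved to offline computation and analysis. After the data is computed in parallel with Apache Spark, the heatmap is also rendered with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. Rendering is currently implemented with Apache Spark; perhaps because Apache Spark is not well suited to this kind of computation, or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .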
Stars: ✭ 13 (-96.35%)
Mutual labels:  big-data
Movies-Analytics-in-Spark-and-Scala
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Stars: ✭ 47 (-86.8%)
Mutual labels:  big-data
Stroom
Stroom is a highly scalable data storage, processing and analysis platform.
Stars: ✭ 344 (-3.37%)
Mutual labels:  big-data
meetups-archivos
Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to disseminate, decentralize, and spread knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops and Semilleros …
Stars: ✭ 60 (-83.15%)
Mutual labels:  big-data
lens
Mirror of Apache Lens
Stars: ✭ 57 (-83.99%)
Mutual labels:  big-data
bigquery-kafka-connect
☁️ nodejs kafka connect connector for Google BigQuery
Stars: ✭ 17 (-95.22%)
Mutual labels:  big-data
Crate
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.
Stars: ✭ 3,254 (+814.04%)
Mutual labels:  big-data
LoL-Match-Prediction
Win probability predictions for League of Legends matches using neural networks
Stars: ✭ 34 (-90.45%)
Mutual labels:  big-data
AverageShiftedHistograms.jl
⚡ Lightning fast density estimation in Julia ⚡
Stars: ✭ 52 (-85.39%)
Mutual labels:  big-data
insightedge
InsightEdge Core
Stars: ✭ 22 (-93.82%)
Mutual labels:  big-data
Uproot3
ROOT I/O in pure Python and NumPy.
Stars: ✭ 312 (-12.36%)
Mutual labels:  big-data
incubator-liminal
Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.
Stars: ✭ 117 (-67.13%)
Mutual labels:  big-data
hadoop-data-ingestion-tool
OLAP and ETL of Big Data
Stars: ✭ 17 (-95.22%)
Mutual labels:  big-data
beekeeper
Service for automatically managing and cleaning up unreferenced data
Stars: ✭ 43 (-87.92%)
Mutual labels:  big-data
Oie Resources
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (-20.51%)
Mutual labels:  big-data
siembol
An open-source, real-time Security Information & Event Management tool based on big data technologies, providing a scalable, advanced security analytics framework.
Stars: ✭ 153 (-57.02%)
Mutual labels:  big-data
alluxio-py
Alluxio Python client - Access Any Data Source with Python
Stars: ✭ 18 (-94.94%)
Mutual labels:  big-data
airavata-php-gateway
Mirror of Apache Airavata PHP Gateway
Stars: ✭ 15 (-95.79%)
Mutual labels:  big-data
Devops Roadmap
DevOps methodology & roadmap for a devops developer in 2019. Interesting books to learn new technologies.
Stars: ✭ 349 (-1.97%)
Mutual labels:  big-data
CS Book
🔥 Latest computer science e-books. Provides downloads of the latest technical e-books. "I just want to out-grind all of you, or be out-ground by all of you!"
Stars: ✭ 40 (-88.76%)
Mutual labels:  big-data
predictionio-template-similar-product
PredictionIO Similar Product Engine Template (Scala-based parallelized engine)
Stars: ✭ 50 (-85.96%)
Mutual labels:  big-data
spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
Stars: ✭ 67 (-81.18%)
Mutual labels:  big-data
Knowage Server
Knowage is the professional open source suite for modern business analytics over traditional sources and big data systems.
Stars: ✭ 276 (-22.47%)
Mutual labels:  big-data
RemoteShuffleService
Celeborn provides an elastic and high-performance service for shuffle and spilled data.
Stars: ✭ 262 (-26.4%)
Mutual labels:  big-data
pypar
Efficient and scalable parallelism using the message passing interface (MPI) to handle big data and highly computational problems.
Stars: ✭ 66 (-81.46%)
Mutual labels:  big-data
terraform-aws-kinesis-firehose
This code creates a Kinesis Firehose in AWS to send CloudWatch log data to S3.
Stars: ✭ 25 (-92.98%)
Mutual labels:  big-data
Mist
Serverless proxy for Spark cluster
Stars: ✭ 309 (-13.2%)
Mutual labels:  big-data
dxram
A distributed in-memory key-value storage for billions of small objects.
Stars: ✭ 25 (-92.98%)
Mutual labels:  big-data
pytorch kmeans
Implementation of the k-means algorithm in PyTorch that works for large datasets
Stars: ✭ 38 (-89.33%)
Mutual labels:  big-data
img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
Stars: ✭ 1,173 (+229.49%)
Mutual labels:  big-data
Attic Predictionio Sdk Php
PredictionIO PHP SDK
Stars: ✭ 272 (-23.6%)
Mutual labels:  big-data
GDLibrary
Matlab library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (-85.96%)
Mutual labels:  big-data
pyspark-cheatsheet
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Stars: ✭ 115 (-67.7%)
Mutual labels:  big-data
lcbo-api
A crawler and API server for Liquor Control Board of Ontario retail data
Stars: ✭ 152 (-57.3%)
Mutual labels:  big-data
Ozone
Scalable, redundant, and distributed object store for Apache Hadoop
Stars: ✭ 330 (-7.3%)
Mutual labels:  big-data
gan deeplearning4j
Automatic feature engineering using Generative Adversarial Networks with Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-94.66%)
Mutual labels:  big-data
hyper-engine
Python library for Bayesian hyper-parameters optimization
Stars: ✭ 80 (-77.53%)
Mutual labels:  big-data
FlameStream
Distributed stream processing model and its implementation
Stars: ✭ 14 (-96.07%)
Mutual labels:  big-data
Succinct
Enabling queries on compressed data.
Stars: ✭ 257 (-27.81%)
Mutual labels:  big-data
ngm
swissgeol.ch gives you insight into geoscientific data - above and below the surface.
Stars: ✭ 23 (-93.54%)
Mutual labels:  big-data
big-data-lite
Samples for the Oracle Big Data Lite VM
Stars: ✭ 41 (-88.48%)
Mutual labels:  big-data
automile-net
Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 24 (-93.26%)
Mutual labels:  big-data
Helix
Mirror of Apache Helix
Stars: ✭ 304 (-14.61%)
Mutual labels:  big-data
falcon
Mirror of Apache Falcon
Stars: ✭ 95 (-73.31%)
Mutual labels:  big-data
Vespa
The open big data serving engine. https://vespa.ai
Stars: ✭ 3,747 (+952.53%)
Mutual labels:  big-data
Attic Apex Core
Mirror of Apache Apex core
Stars: ✭ 346 (-2.81%)
Mutual labels:  big-data
Grouparoo
🦘 The Grouparoo Monorepo - open source customer data sync framework
Stars: ✭ 334 (-6.18%)
Mutual labels:  big-data
Morpheus
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Stars: ✭ 303 (-14.89%)
Mutual labels:  big-data
mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-85.96%)
Mutual labels:  big-data
couchdb-mango
Mirror of Apache CouchDB Mango
Stars: ✭ 34 (-90.45%)
Mutual labels:  big-data