421 open source projects that are alternatives to or similar to MLBD

big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+70%)
Mutual labels:  big-data, mapreduce
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (+255%)
Mutual labels:  big-data, mapreduce
Bigdata Notes
A beginner's guide to big data ⭐
Stars: ✭ 10,991 (+54855%)
Mutual labels:  big-data, mapreduce
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+260%)
Mutual labels:  big-data, mapreduce
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+110140%)
Mutual labels:  big-data, mapreduce
Asakusafw
Asakusa Framework
Stars: ✭ 114 (+470%)
Mutual labels:  big-data, mapreduce
HadoopDedup
🍉 Large-scale deduplication of massive data sets, based on Hadoop and HBase
Stars: ✭ 27 (+35%)
Mutual labels:  big-data, mapreduce
Real Time Social Media Mining
DevOps pipeline for Real Time Social/Web Mining
Stars: ✭ 22 (+10%)
Mutual labels:  big-data
siembol
An open-source, real-time Security Information & Event Management tool based on big data technologies, providing a scalable, advanced security analytics framework.
Stars: ✭ 153 (+665%)
Mutual labels:  big-data
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+95%)
Mutual labels:  big-data
FlameStream
Distributed stream processing model and its implementation
Stars: ✭ 14 (-30%)
Mutual labels:  big-data
nebula
A distributed, fast open-source graph database featuring horizontal scalability and high availability
Stars: ✭ 8,196 (+40880%)
Mutual labels:  big-data
beekeeper
Service for automatically managing and cleaning up unreferenced data
Stars: ✭ 43 (+115%)
Mutual labels:  big-data
airavata-django-portal
Mirror of Apache Airavata Django Portal
Stars: ✭ 20 (+0%)
Mutual labels:  big-data
LoL-Match-Prediction
Win probability predictions for League of Legends matches using neural networks
Stars: ✭ 34 (+70%)
Mutual labels:  big-data
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (-10%)
Mutual labels:  mapreduce
airavata-php-gateway
Mirror of Apache Airavata PHP Gateway
Stars: ✭ 15 (-25%)
Mutual labels:  big-data
meetups-archivos
Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and Semilleros …
Stars: ✭ 60 (+200%)
Mutual labels:  big-data
Data-pipeline-project
Data pipeline project
Stars: ✭ 18 (-10%)
Mutual labels:  mapreduce
insightedge
InsightEdge Core
Stars: ✭ 22 (+10%)
Mutual labels:  big-data
CS Book
🔥 Latest computer science e-books. Downloads of the latest technical e-books. "All I want is to outwork every one of you, or be outworked by you!"
Stars: ✭ 40 (+100%)
Mutual labels:  big-data
nifi
Deploy a secured, clustered, auto-scaling NiFi service in AWS.
Stars: ✭ 37 (+85%)
Mutual labels:  big-data
automile-net
Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 24 (+20%)
Mutual labels:  big-data
spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
Stars: ✭ 67 (+235%)
Mutual labels:  big-data
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-20%)
Mutual labels:  big-data
FIW KRT
Families In the Wild: A Kinship Recognition Toolbox.
Stars: ✭ 18 (-10%)
Mutual labels:  big-data
dxram
A distributed in-memory key-value storage for billions of small objects.
Stars: ✭ 25 (+25%)
Mutual labels:  big-data
nebula
A distributed block-based data storage and compute engine
Stars: ✭ 127 (+535%)
Mutual labels:  big-data
img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
Stars: ✭ 1,173 (+5765%)
Mutual labels:  big-data
arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
Stars: ✭ 2,360 (+11700%)
Mutual labels:  big-data
GDLibrary
Matlab library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (+150%)
Mutual labels:  big-data
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Stars: ✭ 32 (+60%)
Mutual labels:  big-data
lcbo-api
A crawler and API server for Liquor Control Board of Ontario retail data
Stars: ✭ 152 (+660%)
Mutual labels:  big-data
talaria
TalariaDB is a distributed, highly available, and low latency time-series database for Presto
Stars: ✭ 148 (+640%)
Mutual labels:  big-data
gan deeplearning4j
Automatic feature engineering with Generative Adversarial Networks, using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-5%)
Mutual labels:  big-data
rastercube
rastercube is a python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)
Stars: ✭ 15 (-25%)
Mutual labels:  big-data
automile-php
Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 28 (+40%)
Mutual labels:  big-data
SGDLibrary
MATLAB/Octave library for stochastic optimization algorithms: Version 1.0.20
Stars: ✭ 165 (+725%)
Mutual labels:  big-data
lubeck
High level linear algebra library for Dlang
Stars: ✭ 57 (+185%)
Mutual labels:  big-data
azure-big-data-starter
A boilerplate project for Azure Big Data PaaS services
Stars: ✭ 13 (-35%)
Mutual labels:  big-data
ngm
swissgeol.ch gives you insight into geoscientific data, above and below the surface.
Stars: ✭ 23 (+15%)
Mutual labels:  big-data
Big-Data-Demo
A data visualization showcase project based on Vue, three.js, and echarts, with features such as 3D model import and interaction and 3D model annotation
Stars: ✭ 146 (+630%)
Mutual labels:  big-data
rail
Scalable RNA-seq analysis
Stars: ✭ 74 (+270%)
Mutual labels:  mapreduce
beam-site
Apache Beam Site
Stars: ✭ 28 (+40%)
Mutual labels:  big-data
big-data-upf
RECSM-UPF Summer School: Social Media and Big Data Research
Stars: ✭ 21 (+5%)
Mutual labels:  big-data
eventgrad
Event-Triggered Communication in Parallel Machine Learning
Stars: ✭ 14 (-30%)
predictionio-template-ecom-recommender
PredictionIO E-Commerce Recommendation Engine Template (Scala-based parallelized engine)
Stars: ✭ 73 (+265%)
Mutual labels:  big-data
ParallelUtilities.jl
Fast and easy parallel mapreduce on HPC clusters
Stars: ✭ 28 (+40%)
Mutual labels:  mapreduce
xcast
A High-Performance Data Science Toolkit for the Earth Sciences
Stars: ✭ 28 (+40%)
Mutual labels:  big-data
interview-refresh-java-bigdata
A one-stop repo for looking up code snippets of core Java concepts, SQL, data structures, and big data. It also includes interview questions asked in real life.
Stars: ✭ 25 (+25%)
Mutual labels:  mapreduce
cloudberry
Big Data Visualization
Stars: ✭ 89 (+345%)
Mutual labels:  big-data
scarf
Toolkit for highly memory-efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas-scale datasets with millions of cells on a laptop.
Stars: ✭ 54 (+170%)
Mutual labels:  big-data
predictionio-sdk-java
PredictionIO Java SDK
Stars: ✭ 107 (+435%)
Mutual labels:  big-data
shifting
A privacy-focused list of alternatives to mainstream services to help the competition.
Stars: ✭ 31 (+55%)
Mutual labels:  big-data
RemoteShuffleService
Celeborn provides an elastic and high-performance service for shuffle and spilled data.
Stars: ✭ 262 (+1210%)
Mutual labels:  big-data
mls
CSCE 585 - Machine Learning Systems
Stars: ✭ 36 (+80%)
etran
Erlang Parse Transforms Including Fold (MapReduce) comprehension, Elixir-like Pipeline, and default function arguments
Stars: ✭ 19 (-5%)
Mutual labels:  mapreduce
incubator-liminal
Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.
Stars: ✭ 117 (+485%)
Mutual labels:  big-data
IoT-system-PLC-data-to-InfluxDB
This project aims to provide free software that fetches data from PLCs (Siemens S7-300/400/1200/1500) and stores it. The stack used is completely open source. I used InfluxDB as the data storage, so the application follows the Big Data paradigm.
Stars: ✭ 26 (+30%)
Mutual labels:  big-data
yildiz
🦄🌟 Graph Database layer on top of Google Bigtable
Stars: ✭ 24 (+20%)
Mutual labels:  big-data