predictionioPredictionIO, a machine learning server for developers and ML engineers.
Stars: ✭ 12,510 (+4550.56%)
Attic PredictionioPredictionIO, a machine learning server for developers and ML engineers.
Stars: ✭ 12,522 (+4555.02%)
CS Book🔥 Latest computer science e-books. Provides downloads of the latest technical e-books. "All I want is to out-compete every one of you, or be out-competed by you!"
Stars: ✭ 40 (-85.13%)
classifai🔥 One of the most comprehensive open-source data annotation platforms.
Stars: ✭ 99 (-63.2%)
spark-recordsBulletproof Apache Spark jobs with fast root cause analysis of failures.
Stars: ✭ 67 (-75.09%)
meetups-archivosSlides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and Semilleros …
Stars: ✭ 60 (-77.7%)
RemoteShuffleServiceCeleborn provides an elastic and high-performance service for shuffle and spilled data.
Stars: ✭ 262 (-2.6%)
dxramA distributed in-memory key-value storage for billions of small objects.
Stars: ✭ 25 (-90.71%)
img2datasetEasily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
Stars: ✭ 1,173 (+336.06%)
storm-mlAn online learning algorithm library for Storm
Stars: ✭ 18 (-93.31%)
LoL-Match-PredictionWin probability predictions for League of Legends matches using neural networks
Stars: ✭ 34 (-87.36%)
GDLibraryMatlab library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (-81.41%)
lcbo-apiA crawler and API server for Liquor Control Board of Ontario retail data
Stars: ✭ 152 (-43.49%)
rastercuberastercube is a Python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)
Stars: ✭ 15 (-94.42%)
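Big-Data-DemoA data visualization showcase project based on Vue, three.js, and echarts, including features such as 3D model import and interaction and 3D model annotation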
Stars: ✭ 146 (-45.72%)
opendcCollaborative Datacenter Simulation and Exploration for Everybody
Stars: ✭ 40 (-85.13%)
beam-siteApache Beam Site
Stars: ✭ 28 (-89.59%)
talariaTalariaDB is a distributed, highly available, and low latency time-series database for Presto
Stars: ✭ 148 (-44.98%)
scarfToolkit for highly memory-efficient analysis of single-cell RNA-Seq, scATAC-Seq and CITE-Seq data. Analyze atlas-scale datasets with millions of cells on a laptop.
Stars: ✭ 54 (-79.93%)
IoT-system-PLC-data-to-InfluxDBThis project aims to provide free software to fetch data from PLCs (Siemens S7-300/400/1200/1500) and store it. The stack used is completely open source. I used InfluxDB as data storage, so the application follows the Big Data paradigm.
Stars: ✭ 26 (-90.33%)
xcastA High-Performance Data Science Toolkit for the Earth Sciences
Stars: ✭ 28 (-89.59%)
spark-rootApache Spark Data Source for ROOT File Format
Stars: ✭ 28 (-89.59%)
subsemblesubsemble R package for ensemble learning on subsets of data
Stars: ✭ 40 (-85.13%)
nebulaA distributed, fast open-source graph database featuring horizontal scalability and high availability
Stars: ✭ 8,196 (+2946.84%)
arrow-datafusionApache Arrow DataFusion SQL Query Engine
Stars: ✭ 2,360 (+777.32%)
couchdb-mangoMirror of Apache CouchDB Mango
Stars: ✭ 34 (-87.36%)
gan deeplearning4jAutomatic feature engineering using Generative Adversarial Networks with Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-92.94%)
SGDLibraryMATLAB/Octave library for stochastic optimization algorithms: Version 1.0.20
Stars: ✭ 165 (-38.66%)
datalake-etl-pipelineSimplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-85.5%)
SynapseMLSimple and Distributed Machine Learning
Stars: ✭ 3,355 (+1147.21%)
automile-phpAutomile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 28 (-89.59%)
cloudberryBig Data Visualization
Stars: ✭ 89 (-66.91%)
FlameStreamDistributed stream processing model and its implementation
Stars: ✭ 14 (-94.8%)
lubeckHigh level linear algebra library for Dlang
Stars: ✭ 57 (-78.81%)
clusterdockclusterdock is a framework for creating Docker-based container clusters
Stars: ✭ 26 (-90.33%)
incubator-liminalApache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.
Stars: ✭ 117 (-56.51%)
ngmswissgeol.ch gives you insight into geoscientific data, above and below the surface.
Stars: ✭ 23 (-91.45%)
nifiDeploy a secured, clustered, auto-scaling NiFi service in AWS.
Stars: ✭ 37 (-86.25%)
nebulaA distributed block-based data storage and compute engine
Stars: ✭ 127 (-52.79%)
automile-netAutomile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 24 (-91.08%)
big-data-upfRECSM-UPF Summer School: Social Media and Big Data Research
Stars: ✭ 21 (-92.19%)
MLBDMaterials for "Machine Learning on Big Data" course
Stars: ✭ 20 (-92.57%)