Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+850%)
Mutual labels: big-data, streaming-data
Onlinestats.jl
Single-pass algorithms for statistics
Stars: ✭ 507 (+1850%)
Mutual labels: big-data, streaming-data
nebula
A distributed block-based data storage and compute engine
Stars: ✭ 127 (+388.46%)
Mutual labels: big-data
ByteSlice
"Byteslice: Pushing the envelop of main memory data processing with a new storage layout" (SIGMOD'15)
Stars: ✭ 24 (-7.69%)
Mutual labels: big-data
xcast
A High-Performance Data Science Toolkit for the Earth Sciences
Stars: ✭ 28 (+7.69%)
Mutual labels: big-data
cloudberry
Big Data Visualization
Stars: ✭ 89 (+242.31%)
Mutual labels: big-data
awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
Stars: ✭ 11,093 (+42565.38%)
Mutual labels: streaming-data
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing the UCX communication layer
Stars: ✭ 32 (+23.08%)
Mutual labels: big-data
godsend
A simple and eloquent workflow for streaming messages to micro-services.
Stars: ✭ 15 (-42.31%)
Mutual labels: streaming-data
bigquery-kafka-connect
☁️ Node.js Kafka Connect connector for Google BigQuery
Stars: ✭ 17 (-34.62%)
Mutual labels: big-data
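Big-Data-Demo
A data visualization showcase project based on Vue, three.js, and echarts, with features including 3D model import and interaction and 3D model annotation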
Stars: ✭ 146 (+461.54%)
Mutual labels: big-data
arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
Stars: ✭ 2,360 (+8976.92%)
Mutual labels: big-data
insightedge
InsightEdge Core
Stars: ✭ 22 (-15.38%)
Mutual labels: big-data
talaria
TalariaDB is a distributed, highly available, and low-latency time-series database for Presto
Stars: ✭ 148 (+469.23%)
Mutual labels: big-data
incubator-liminal
Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.
Stars: ✭ 117 (+350%)
Mutual labels: big-data
MLBD
Materials for "Machine Learning on Big Data" course
Stars: ✭ 20 (-23.08%)
Mutual labels: big-data
beekeeper
Service for automatically managing and cleaning up unreferenced data
Stars: ✭ 43 (+65.38%)
Mutual labels: big-data
LoL-Match-Prediction
Win probability predictions for League of Legends matches using neural networks
Stars: ✭ 34 (+30.77%)
Mutual labels: big-data
meetups-archivos
Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through meetups, workshops, and incubator programs …
Stars: ✭ 60 (+130.77%)
Mutual labels: big-data
Twitter-Stream-API-Dataset
Twitter Dynamic Dataset API. Create any dataset YOU want.
Stars: ✭ 20 (-23.08%)
Mutual labels: streaming-data