learning-hadoop-and-spark: Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+26.96%)
Just Dashboard: 📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+1213.91%)
Sylph: Stream computing platform for bigdata
Stars: ✭ 362 (+214.78%)
Vespa: The open big data serving engine. https://vespa.ai
Stars: ✭ 3,747 (+3158.26%)
Calcite: Apache Calcite
Stars: ✭ 2,816 (+2348.7%)
meetups-archivos: Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through meetups, workshops, and seed groups …
Stars: ✭ 60 (-47.83%)
Couchdb Docker: Semi-official Apache CouchDB Docker images
Stars: ✭ 194 (+68.7%)
Grouparoo: 🦘 The Grouparoo Monorepo - open source customer data sync framework
Stars: ✭ 334 (+190.43%)
automile-php: Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 28 (-75.65%)
Tez: Apache Tez
Stars: ✭ 313 (+172.17%)
dlsa: Distributed least squares approximation (dlsa) implemented with Apache Spark
Stars: ✭ 25 (-78.26%)
Pythondata: repo for code published on pythondata.com
Stars: ✭ 113 (-1.74%)
Fluid: Fluid, elastic data abstraction and acceleration for BigData/AI applications in cloud
Stars: ✭ 265 (+130.43%)
Presto Go Client: A Presto client for the Go programming language.
Stars: ✭ 183 (+59.13%)
nebula: A distributed block-based data storage and compute engine
Stars: ✭ 127 (+10.43%)
Ambari: Mirror of Apache Ambari
Stars: ✭ 1,576 (+1270.43%)
lubeck: High level linear algebra library for Dlang
Stars: ✭ 57 (-50.43%)
Smooks: An extensible Java framework for building XML and non-XML streaming applications
Stars: ✭ 293 (+154.78%)
Flink: Apache Flink is an open source project of The Apache Software Foundation (ASF).
The Apache Flink project originated from the Stratosphere research project.
Stars: ✭ 17,781 (+15361.74%)
Keyvi: Keyvi - a key value index that powers Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
Stars: ✭ 171 (+48.7%)
Trino: Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+3883.48%)
ngm: swissgeol.ch gives you insight into geoscientific data - above and below the surface.
Stars: ✭ 23 (-80%)
Geopyspark: GeoTrellis for PySpark
Stars: ✭ 167 (+45.22%)
Datahub: The Metadata Platform for the Modern Data Stack
Stars: ✭ 4,232 (+3580%)
wrangler: Wrangler Transform, a DMD system for transforming Big Data
Stars: ✭ 63 (-45.22%)
Genie: Distributed Big Data Orchestration Service
Stars: ✭ 1,544 (+1242.61%)
Fluo: Apache Fluo
Stars: ✭ 159 (+38.26%)
bigstatsr: R package for statistical tools with big matrices stored on disk.
Stars: ✭ 139 (+20.87%)
automile-net: Automile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 24 (-79.13%)
Bigdataclass: Two-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (-4.35%)
Geni: A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+32.17%)
bigdata-fun: A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-87.83%)
spark: Apache Spark enhanced with native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end is now on https://github.com/apache/spark/
Stars: ✭ 609 (+429.57%)
beekeeper: Service for automatically managing and cleaning up unreferenced data
Stars: ✭ 43 (-62.61%)
Spark R Notebooks: R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-5.22%)
Datasciencevm: Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (+33.04%)
iis: Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-86.09%)
ibmpairs: open source tools for interaction with IBM PAIRS
Stars: ✭ 23 (-80%)
spark-transformers: Spark-Transformers, a library for exporting Apache Spark MLlib models to use them in any Java application with no other dependencies.
Stars: ✭ 39 (-66.09%)
100daysofmlcode: My journey to learn and grow in the domain of Machine Learning and Artificial Intelligence by performing the #100DaysofMLCode Challenge.
Stars: ✭ 146 (+26.96%)
Tennis Crystal Ball: Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-6.96%)
merkle-db: High-scalability analytics database built on immutable merkle-trees
Stars: ✭ 44 (-61.74%)
Maha: A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.
Stars: ✭ 101 (-12.17%)
Vizuka: Explore high-dimensional datasets and how your algo handles specific regions.
Stars: ✭ 100 (-13.04%)
Graph sampling: Graph Sampling is a Python package containing various approaches that sample the original graph according to different sample sizes.
Stars: ✭ 99 (-13.91%)
metriql: The metrics layer for your data. Join us at https://metriql.com/slack
Stars: ✭ 227 (+97.39%)
Metamodel: Mirror of Apache Metamodel
Stars: ✭ 143 (+24.35%)