Flink Shaded
Apache Flink shaded artifacts repository
Stars: ✭ 67 (-95.75%)
Ymcache
YMCache is a lightweight object caching solution for iOS and Mac OS X that is designed for highly parallel access scenarios.
Stars: ✭ 58 (-96.32%)
Appdocs
Application Performance Optimization Summary
Stars: ✭ 1,169 (-25.82%)
Traildb
TrailDB is an efficient tool for storing and querying series of events
Stars: ✭ 1,029 (-34.71%)
Warp
Convert and analyze large data sets at light speed, on Mac and iOS.
Stars: ✭ 62 (-96.07%)
Macro ml
Course Website on Macroeconomic Analysis with Machine Learning and Big Data
Stars: ✭ 53 (-96.64%)
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-94.99%)
Egads
A Java package to automatically detect anomalies in large-scale time-series data
Stars: ✭ 997 (-36.74%)
Treeviz
Tree diagrams with JavaScript 🌲 📈
Stars: ✭ 95 (-93.97%)
Carbondata
Mirror of Apache CarbonData
Stars: ✭ 1,158 (-26.52%)
Graph sampling
Graph Sampling is a Python package containing various approaches that sample the original graph according to different sample sizes.
Stars: ✭ 99 (-93.72%)
Cloud Volume
Read and write Neuroglancer datasets programmatically.
Stars: ✭ 63 (-96%)
Verticapy
VerticaPy is a Python library that exposes scikit-like functionality for conducting data science projects on data stored in Vertica, taking advantage of Vertica's speed and built-in analytics and machine learning capabilities.
Stars: ✭ 59 (-96.26%)
Tennis Crystal Ball
Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-93.21%)
Kibble 1
Apache Kibble - a tool to collect, aggregate and visualize data about any software project
Stars: ✭ 54 (-96.57%)
Uproot4
ROOT I/O in pure Python and NumPy.
Stars: ✭ 80 (-94.92%)
Datumbox Framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Stars: ✭ 1,063 (-32.55%)
Orc
An ORC file format reader and writer for Go.
Stars: ✭ 97 (-93.85%)
Bookkeeper
Apache BookKeeper
Stars: ✭ 1,178 (-25.25%)
Spark Py Notebooks
Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (-15.1%)
Vizuka
Explore high-dimensional datasets and how your algo handles specific regions.
Stars: ✭ 100 (-93.65%)
Countly Sdk Cordova
Countly Product Analytics SDK for Cordova, Icenium and Phonegap
Stars: ✭ 69 (-95.62%)
Reef
Mirror of Apache REEF
Stars: ✭ 92 (-94.16%)
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-95.88%)
Bitcoin Value Predictor
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (-94.23%)
Nabhash
An extremely fast, non-crypto-safe, AES-based hash algorithm for Big Data
Stars: ✭ 62 (-96.07%)
Attic Lens
Mirror of Apache Lens
Stars: ✭ 58 (-96.32%)
Bigdataclass
Two-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (-93.02%)
Panoptes
A Global Scale Network Telemetry Ecosystem
Stars: ✭ 80 (-94.92%)
Lifion Kinesis
A native Node.js producer and consumer library for Amazon Kinesis Data Streams
Stars: ✭ 54 (-96.57%)
Kudu
Mirror of Apache Kudu
Stars: ✭ 1,360 (-13.71%)
Oodt
Mirror of Apache OODT
Stars: ✭ 52 (-96.7%)
Iotdb
Apache IoTDB
Stars: ✭ 1,221 (-22.53%)
Trck
Query engine for TrailDB
Stars: ✭ 48 (-96.95%)
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (-34.96%)
Attaca
Robust, distributed version control for large files.
Stars: ✭ 41 (-97.4%)
Logisland
Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources, and sinks is available.
Stars: ✭ 97 (-93.85%)
Cookbook
The Data Engineering Cookbook
Stars: ✭ 9,829 (+523.67%)
Genie
Distributed Big Data Orchestration Service
Stars: ✭ 1,544 (-2.03%)
Spark R Notebooks
R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-93.08%)
Maha
A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.
Stars: ✭ 101 (-93.59%)
Streamx
kafka-connect-s3: Ingest data from Kafka to object stores (S3)
Stars: ✭ 96 (-93.91%)
Labs
Research on distributed systems
Stars: ✭ 73 (-95.37%)