Succinct: Enabling queries on compressed data.
Stars: ✭ 257 (-79.89%)
Node Parquet: Node.js module to access Apache Parquet format files
Stars: ✭ 46 (-96.4%)
Clickhouse: ClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+1550.16%)
bandar-log: Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (-98.44%)
Vue Virtual Scroll List: ⚡️ A Vue component that supports large data lists with high, efficient rendering performance.
Stars: ✭ 3,201 (+150.47%)
Kafka Streams: Equivalent to kafka-streams 🐙 for Node.js ✨🐢🚀✨
Stars: ✭ 613 (-52.03%)
Data Accelerator: Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience to help with creation, editing and management of Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (-80.67%)
bigtable: TypeScript Bigtable Client with 🔋🔋 included.
Stars: ✭ 13 (-98.98%)
Aws Etl Orchestrator: A serverless architecture for orchestrating ETL jobs in arbitrarily complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (-80.83%)
Panoptes: A Global Scale Network Telemetry Ecosystem
Stars: ✭ 80 (-93.74%)
Kafka Ui: Open-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (-82%)
bigdata-fun: A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-98.9%)
Eland: Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (-81.61%)
Oozie: Mirror of Apache Oozie
Stars: ✭ 602 (-52.9%)
Lite Virtual List: Vue-based virtual list component library that supports waterfall flow
Stars: ✭ 223 (-82.55%)
aut: The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-91.31%)
Usql: U-SQL Examples and Issue Tracking
Stars: ✭ 221 (-82.71%)
Moosefs: MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (-19.8%)
Helicalinsight: Helical Insight software is the world’s first Open Source Business Intelligence framework, which helps you make sense of your data and make well-informed decisions.
Stars: ✭ 214 (-83.26%)
Giraph: Mirror of Apache Giraph
Stars: ✭ 569 (-55.48%)
ibmpairs: Open source tools for interaction with IBM PAIRS
Stars: ✭ 23 (-98.2%)
Data Science Live Book: An open source book to learn data science, data analysis and machine learning, suitable for all ages!
Stars: ✭ 193 (-84.9%)
Rsparkling: RSparkling, for using H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-94.91%)
Gun: An open source cybersecurity protocol for syncing decentralized graph data.
Stars: ✭ 15,172 (+1087.17%)
spark-acid: ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-92.88%)
Pachyderm: Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+315.1%)
Pretzel: JavaScript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-97.97%)
Oap: Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (-73.16%)
GDLibrary: MATLAB library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (-96.09%)
Dvid: Distributed, Versioned, Image-oriented Dataservice
Stars: ✭ 174 (-86.38%)
Attic Predictionio: PredictionIO, a machine learning server for developers and ML engineers.
Stars: ✭ 12,522 (+879.81%)
Quilt: Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (-21.21%)
Keyvi: Keyvi, the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
Stars: ✭ 161 (-87.4%)
Presto: The official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+913.85%)
Couchdb: Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
Stars: ✭ 5,166 (+304.23%)
Spark.jl: Julia binding for Apache Spark
Stars: ✭ 153 (-88.03%)
alluxio-py: Alluxio Python client - Access Any Data Source with Python
Stars: ✭ 18 (-98.59%)
Fili: Easily make RESTful web services for time series reporting with Big Data analytics engines like Druid and SQL Databases.
Stars: ✭ 151 (-88.18%)
Cookbook: The Data Engineering Cookbook
Stars: ✭ 9,829 (+669.09%)
centurion: Kotlin Bigdata Toolkit
Stars: ✭ 320 (-74.96%)
Hydrograph: A visual ETL development and debugging tool for big data
Stars: ✭ 144 (-88.73%)
Arkime: Arkime (formerly Moloch) is an open source, large scale, full packet capturing, indexing, and database system.
Stars: ✭ 4,994 (+290.77%)
meepo: Heterogeneous storage data migration
Stars: ✭ 29 (-97.73%)
Egads: A Java package to automatically detect anomalies in large scale time-series data
Stars: ✭ 997 (-21.99%)
Stroom: Stroom is a highly scalable data storage, processing and analysis platform.
Stars: ✭ 344 (-73.08%)
lcbo-api: A crawler and API server for Liquor Control Board of Ontario retail data
Stars: ✭ 152 (-88.11%)
hotmap: WebGL Heatmap Viewer for Big Data and Bioinformatics
Stars: ✭ 13 (-98.98%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-93.82%)
Appdocs: Application Performance Optimization Summary
Stars: ✭ 1,169 (-8.53%)