Grouparoo🦘 The Grouparoo Monorepo - open source customer data sync framework
Stars: ✭ 334 (+160.94%)
ClickhouseClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+16375.78%)
Vue Virtual Scroll List⚡️A Vue component that supports large data lists with high rendering performance and efficiency.
Stars: ✭ 3,201 (+2400.78%)
TezApache Tez
Stars: ✭ 313 (+144.53%)
Data AcceleratorData Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+92.97%)
Esper TvEsper instance for TV news analysis
Stars: ✭ 37 (-71.09%)
Aws Etl OrchestratorA serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (+91.41%)
DeltaAn open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (+2949.22%)
Kafka UiOpen-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (+79.69%)
RichdemHigh-performance Terrain and Hydrology Analysis
Stars: ✭ 127 (-0.78%)
ElandPython Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+83.59%)
FluidFluid, elastic data abstraction and acceleration for BigData/AI applications in cloud
Stars: ✭ 265 (+107.03%)
Lite Virtual ListVirtual list component library supporting waterfall flow based on vue
Stars: ✭ 223 (+74.22%)
UsqlU-SQL Examples and Issue Tracking
Stars: ✭ 221 (+72.66%)
MorpheusMorpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Stars: ✭ 303 (+136.72%)
Awkward 0.xManipulate arrays of complex data structures as easily as Numpy.
Stars: ✭ 216 (+68.75%)
HelicalinsightHelical Insight software is the world's first open source business intelligence framework, which helps you make sense of your data and make well-informed decisions.
Stars: ✭ 214 (+67.19%)
QcportalA client interface to the QCArchive Project (read-only image of QCFractal)
Stars: ✭ 29 (-77.34%)
Data Science Live BookAn open source book to learn data science, data analysis and machine learning, suitable for all ages!
Stars: ✭ 193 (+50.78%)
SmooksAn extensible Java framework for building XML and non-XML streaming applications
Stars: ✭ 293 (+128.91%)
GunAn open source cybersecurity protocol for syncing decentralized graph data.
Stars: ✭ 15,172 (+11753.13%)
Spark R Notebooks R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-14.84%)
FlumeMirror of Apache Flume
Stars: ✭ 2,200 (+1618.75%)
FlinkApache Flink is an open source project of The Apache Software Foundation (ASF). The Apache Flink project originated from the Stratosphere research project.
Stars: ✭ 17,781 (+13791.41%)
DvidDistributed, Versioned, Image-oriented Dataservice
Stars: ✭ 174 (+35.94%)
SparkApache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+24601.56%)
Attic PredictionioPredictionIO, a machine learning server for developers and ML engineers.
Stars: ✭ 12,522 (+9682.81%)
TrinoOfficial repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+3478.91%)
KeyviKeyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
Stars: ✭ 161 (+25.78%)
Uproot4ROOT I/O in pure Python and NumPy.
Stars: ✭ 80 (-37.5%)
PrestoThe official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+10022.66%)
Parquet Dotnet🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+115.63%)
Spark.jlJulia binding for Apache Spark
Stars: ✭ 153 (+19.53%)
PhoenixMirror of Apache Phoenix
Stars: ✭ 867 (+577.34%)
FiliEasily make RESTful web services for time series reporting with Big Data analytics engines like Druid and SQL Databases.
Stars: ✭ 151 (+17.97%)
DatahubThe Metadata Platform for the Modern Data Stack
Stars: ✭ 4,232 (+3206.25%)
ParquetviewerSimple windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+13.28%)
Hdfs ShellHDFS Shell is an HDFS manipulation tool for working with functions integrated in Hadoop DFS
Stars: ✭ 117 (-8.59%)
HydrographA visual ETL development and debugging tool for big data
Stars: ✭ 144 (+12.5%)
bigstatsrR package for statistical tools with big matrices stored on disk.
Stars: ✭ 139 (+8.59%)
SparkjniA heterogeneous Apache Spark framework.
Stars: ✭ 11 (-91.41%)
Belajarpython.comOpen Source Indonesian Python Programming Tutorial Site
Stars: ✭ 141 (+10.16%)
mmtf-workshop-2018Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-60.94%)
SetlA simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-38.28%)
bigdata-funA complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-89.06%)
AzuredatalakeSamples and Docs for Azure Data Lake Store and Analytics
Stars: ✭ 128 (+0%)
Griffon VmGriffon Data Science Virtual Machine
Stars: ✭ 128 (+0%)
ReportAutomated report configuration platform. Demo: http://58.87.112.247/report (account: visitor, password: 123456)
Stars: ✭ 123 (-3.91%)
Just Dashboard📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+1080.47%)
LogislandScalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink being in the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of valuable ready to use processors, data sources and sinks are available.
Stars: ✭ 97 (-24.22%)
Attic LensMirror of Apache Lens
Stars: ✭ 58 (-54.69%)
CoursesQuiz & Assignment of Coursera
Stars: ✭ 454 (+254.69%)
GDLibraryMatlab library for gradient descent algorithms: Version 1.0.1
Stars: ✭ 50 (-60.94%)