Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+14598.67%)
Azure Event Hubs Spark
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Stars: ✭ 140 (-6.67%)
Docker Superset
Repository for Docker Image of Apache-Superset. [Docker Image: https://hub.docker.com/r/abhioncbr/docker-superset]
Stars: ✭ 86 (-42.67%)
Pyspark Setup Demo
Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-84%)
Hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+64%)
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+44%)
Datafusion
DataFusion has now been donated to the Apache Arrow project
Stars: ✭ 611 (+307.33%)
Duckdb
DuckDB is an in-process SQL OLAP Database Management System
Stars: ✭ 4,014 (+2576%)
Calcite
Apache Calcite
Stars: ✭ 2,816 (+1777.33%)
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+994.67%)
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+1047.33%)
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+1929.33%)
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+242%)
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+396.67%)
SynapseML
Simple and Distributed Machine Learning
Stars: ✭ 3,355 (+2136.67%)
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Stars: ✭ 32 (-78.67%)
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-86.67%)
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-83.33%)
Data Algorithms Book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+532.67%)
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+557.33%)
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+2442%)
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+536%)
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+583.33%)
Eventql
Distributed "massively parallel" SQL query engine
Stars: ✭ 1,121 (+647.33%)
God Of Bigdata
Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+3905.33%)
Kyuubi
Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (+142%)
Scriptis
Scriptis is for interactive data analysis with script development (SQL, Pyspark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Stars: ✭ 696 (+364%)
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+3575.33%)
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+519.33%)
Pyspark Examples
Code examples on Apache Spark using Python
Stars: ✭ 58 (-61.33%)
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-57.33%)
Locustdb
Massively parallel, high performance analytics database that will rapidly devour all of your data.
Stars: ✭ 1,250 (+733.33%)
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-44%)
Graphjin
GraphJin - Build APIs in 5 minutes with GraphQL. An instant GraphQL to SQL compiler.
Stars: ✭ 1,264 (+742.67%)
Evolutility Server Node
Model-driven REST or GraphQL backend for CRUD and more, written in JavaScript, using Node.js, Express, and PostgreSQL.
Stars: ✭ 84 (-44%)
Training Material
A collection of code examples as well as presentations for training purposes
Stars: ✭ 85 (-43.33%)
Cuesheet
A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-42.67%)
Electrocrud
Database CRUD Application Built on Electron | MySQL, Postgres, SQLite
Stars: ✭ 1,267 (+744.67%)
React Native Firebase
🔥 A well-tested feature-rich modular Firebase implementation for React Native. Supports both iOS & Android platforms for all Firebase services.
Stars: ✭ 9,674 (+6349.33%)
Spark Nlp Models
Models and Pipelines for the Spark NLP library
Stars: ✭ 88 (-41.33%)
Qtl
A friendly and lightweight C++ database library for MySQL, PostgreSQL, SQLite and ODBC.
Stars: ✭ 92 (-38.67%)
Wifi
A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-38%)