1453 open source projects that are alternatives to or similar to aut

Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-90.09%)
Mutual labels:  big-data, spark
Hdfs Shell
HDFS Shell is an HDFS manipulation tool for working with functions integrated in Hadoop DFS
Stars: ✭ 117 (+5.41%)
Mutual labels:  big-data, hadoop
Scala Spark Tutorial
Project for James' Apache Spark with Scala course
Stars: ✭ 121 (+9.01%)
Mutual labels:  big-data, apache-spark
Drill
Apache Drill is a distributed MPP query layer for self-describing data
Stars: ✭ 1,619 (+1358.56%)
Mutual labels:  big-data, hadoop
Calcite Avatica
Mirror of Apache Calcite - Avatica
Stars: ✭ 130 (+17.12%)
Mutual labels:  big-data, hadoop
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-81.98%)
Mutual labels:  spark, hadoop
Delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (+3416.22%)
Mutual labels:  big-data, spark
Ozone
Scalable, redundant, and distributed object store for Apache Hadoop
Stars: ✭ 330 (+197.3%)
Mutual labels:  big-data, hadoop
Ignite
Apache Ignite
Stars: ✭ 4,027 (+3527.93%)
Mutual labels:  big-data, hadoop
Hive
Apache Hive
Stars: ✭ 4,031 (+3531.53%)
Mutual labels:  big-data, hadoop
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+225.23%)
Mutual labels:  big-data, spark
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+4866.67%)
Mutual labels:  big-data, spark
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-95.5%)
Mutual labels:  big-data, hadoop
Spark Doc Zh
Chinese edition of the official Apache Spark documentation
Stars: ✭ 1,126 (+914.41%)
Mutual labels:  big-data, spark
Spark.jl
Julia binding for Apache Spark
Stars: ✭ 153 (+37.84%)
Mutual labels:  big-data, spark
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+823.42%)
Mutual labels:  big-data, hadoop
Presto
The official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+11572.97%)
Mutual labels:  big-data, hadoop
Geopyspark
GeoTrellis for PySpark
Stars: ✭ 167 (+50.45%)
Mutual labels:  big-data, spark
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-28.83%)
Mutual labels:  big-data, spark
Bitcoin Value Predictor
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (-18.02%)
Mutual labels:  big-data, pyspark
Spark
Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+28384.68%)
Mutual labels:  big-data, spark
Bigdataclass
Two-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (-0.9%)
Mutual labels:  big-data, spark
Awkward 0.x
Manipulate arrays of complex data structures as easily as Numpy.
Stars: ✭ 216 (+94.59%)
Mutual labels:  big-data, analysis
big-data-lite
Samples to the Oracle Big Data Lite VM
Stars: ✭ 41 (-63.06%)
Mutual labels:  big-data, hadoop
Logisland
Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources and sinks is available.
Stars: ✭ 97 (-12.61%)
Mutual labels:  big-data, spark
Mastering Spark Sql Book
The Internals of Spark SQL
Stars: ✭ 234 (+110.81%)
Mutual labels:  spark, apache-spark
spark-acid
ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-18.02%)
Mutual labels:  big-data, spark
BigData-News
Real-time big data system project for a news site, based on Spark 2.2
Stars: ✭ 36 (-67.57%)
Mutual labels:  spark, hadoop
data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-54.95%)
Mutual labels:  spark, pyspark
Hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+121.62%)
Mutual labels:  big-data, spark
Detecting-Malicious-URL-Machine-Learning
No description or website provided.
Stars: ✭ 47 (-57.66%)
Mutual labels:  big-data, apache-spark
spark3D
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (-79.28%)
Mutual labels:  apache-spark, pyspark
v6.dooring.public
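A large-screen data visualization solution that provides a visual editing engine, helping individuals and companies easily customize their own large-screen visualization applications.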
Stars: ✭ 323 (+190.99%)
Mutual labels:  big-data, big-data-analytics
Parquetviewer
Simple windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+30.63%)
Mutual labels:  big-data, apache-spark
Calcite
Apache Calcite
Stars: ✭ 2,816 (+2436.94%)
Mutual labels:  big-data, hadoop
Hydrograph
A visual ETL development and debugging tool for big data
Stars: ✭ 144 (+29.73%)
Mutual labels:  big-data, apache-spark
spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
Stars: ✭ 55 (-50.45%)
Mutual labels:  apache-spark, pyspark
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (+26.13%)
Mutual labels:  big-data, hadoop
learning-hadoop-and-spark
Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+31.53%)
Mutual labels:  apache-spark, hadoop
awesome-tools
curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (-72.07%)
Mutual labels:  big-data, apache-spark
incubator-linkis
Linkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+2115.32%)
Mutual labels:  spark, pyspark
Social-Network-Analysis-in-Python
Social Network Facebook Analysis (Python, Networkx)
Stars: ✭ 26 (-76.58%)
Mutual labels:  big-data, analysis
Eland
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+111.71%)
Mutual labels:  big-data, dataframe
learn-by-examples
Real-world Spark pipelines examples
Stars: ✭ 84 (-24.32%)
Mutual labels:  apache-spark, pyspark
mmtf-spark
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (-81.98%)
Mutual labels:  big-data, apache-spark
Sparkora
Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟
Stars: ✭ 51 (-54.05%)
Mutual labels:  apache-spark, pyspark
awesome-AI-kubernetes
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (-14.41%)
Mutual labels:  big-data, spark
gan deeplearning4j
Automatic feature engineering using Generative Adversarial Networks using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-82.88%)
Mutual labels:  big-data, apache-spark
spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
Stars: ✭ 67 (-39.64%)
Mutual labels:  big-data, apache-spark
pyspark-ML-in-Colab
Pyspark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (-71.17%)
Mutual labels:  hadoop, pyspark
rastercube
rastercube is a python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)
Stars: ✭ 15 (-86.49%)
Mutual labels:  big-data, hadoop
arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
Stars: ✭ 2,360 (+2026.13%)
Mutual labels:  big-data, dataframe
jupyterlab-sparkmonitor
JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
Stars: ✭ 78 (-29.73%)
Mutual labels:  apache-spark, pyspark
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-85.59%)
Mutual labels:  big-data, hadoop
DaFlow
Apache Spark-based Data Flow (ETL) framework that supports multiple read and write destinations of different types and also supports multiple categories of transformation rules.
Stars: ✭ 24 (-78.38%)
Mutual labels:  apache-spark, hadoop
Springboard-Data-Science-Immersive
No description or website provided.
Stars: ✭ 52 (-53.15%)
Mutual labels:  hadoop, pyspark
clusterdock
clusterdock is a framework for creating Docker-based container clusters
Stars: ✭ 26 (-76.58%)
Mutual labels:  big-data, hadoop
check-engine
Data validation library for PySpark 3.0.0
Stars: ✭ 29 (-73.87%)
Mutual labels:  big-data, pyspark
ODSC India 2018
My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-76.58%)
Mutual labels:  spark, pyspark
Javaorbigdata Interview
A compilation of interview knowledge points for Java developers and big data developers
Stars: ✭ 203 (+82.88%)
Mutual labels:  spark, hadoop