1499 open source projects that are alternatives to or similar to incubator-linkis

Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (-59.9%)
Mutual labels:  spark, pyspark
Hadoop cookbook
Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-96.67%)
Mutual labels:  spark, hive
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-96.58%)
Mutual labels:  spark, hive
ODSC India 2018
My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-98.94%)
Mutual labels:  spark, pyspark
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+17.89%)
Mutual labels:  spark, pyspark
swordfish
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-98.58%)
Mutual labels:  spark, hive
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-95.61%)
Mutual labels:  spark, pyspark
Eat pyspark in 10 days
pyspark🍒🥭 is delicious, just eat it!😋😋
Stars: ✭ 116 (-95.28%)
Mutual labels:  spark, pyspark
Xsql
Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (-92.84%)
Mutual labels:  spark, hive
Hadoop Docker
A Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Stars: ✭ 238 (-90.32%)
Mutual labels:  spark, hive
TiBigData
TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (-92.19%)
Mutual labels:  presto, hive
liquibase-impala
Liquibase extension to add Impala Database support
Stars: ✭ 23 (-99.06%)
Mutual labels:  hive, impala
hadoop-etl-udfs
The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-99.31%)
Mutual labels:  hive, udf
implyr
SQL backend to dplyr for Impala
Stars: ✭ 74 (-96.99%)
Mutual labels:  jdbc, impala
Hive Jdbc Uber Jar
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Stars: ✭ 188 (-92.35%)
Mutual labels:  hive, jdbc
Drill
Apache Drill is a distributed MPP query layer for self-describing data
Stars: ✭ 1,619 (-34.16%)
Mutual labels:  hive, jdbc
HiveJdbcStorageHandler
No description or website provided.
Stars: ✭ 21 (-99.15%)
Mutual labels:  hive, jdbc
spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-98.98%)
Mutual labels:  spark, pyspark
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-95.49%)
Mutual labels:  spark, pyspark
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-98.98%)
Mutual labels:  spark, pyspark
Quill
Compile-time Language Integrated Queries for Scala
Stars: ✭ 1,998 (-18.75%)
Mutual labels:  spark, jdbc
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (-83.49%)
Mutual labels:  spark, pyspark
Sparkling Titanic
Training models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-99.51%)
Mutual labels:  spark, pyspark
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-93.9%)
Mutual labels:  spark, pyspark
hive to es
A small tool for syncing Hive data warehouse data to Elasticsearch
Stars: ✭ 21 (-99.15%)
Mutual labels:  hive, impala
hive-jdbc-driver
An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Stars: ✭ 31 (-98.74%)
Mutual labels:  hive, jdbc
data processing course
Some class materials for a data processing course using PySpark
Stars: ✭ 50 (-97.97%)
Mutual labels:  spark, pyspark
alluxio-py
Alluxio Python client - Access Any Data Source with Python
Stars: ✭ 18 (-99.27%)
Mutual labels:  storage
vok-orm
Mapping rows from a SQL database to POJOs in its simplest form
Stars: ✭ 13 (-99.47%)
Mutual labels:  jdbc
awesome-AI-kubernetes
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (-96.14%)
Mutual labels:  spark
MiniRTS
A game engine to learn about game engine development
Stars: ✭ 99 (-95.97%)
Mutual labels:  engine
MonoGame.Forms
MonoGame.Forms is the easiest way of integrating a MonoGame render window into your Windows Forms project. It should make your life much easier, when you want to create your own editor environment.
Stars: ✭ 183 (-92.56%)
Mutual labels:  engine
OpenAM
OpenAM is an open access management solution that includes Authentication, SSO, Authorization, Federation, Entitlements and Web Services Security.
Stars: ✭ 476 (-80.64%)
Mutual labels:  jdbc
spark-druid-olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 286 (-88.37%)
Mutual labels:  spark
Azure-Databricks-NYC-Taxi-Workshop
An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset
Stars: ✭ 71 (-97.11%)
Mutual labels:  pyspark
docker-hive
Docker image for Apache Hive Metastore
Stars: ✭ 42 (-98.29%)
Mutual labels:  hive
taucmdr
Performance engineering for the rest of us.
Stars: ✭ 26 (-98.94%)
Mutual labels:  storage
DTC
DTC is a high-performance Distributed Table Cache system designed by JD.com that offers hotspot data caching for databases in order to reduce database pressure and improve QPS.
Stars: ✭ 21 (-99.15%)
Mutual labels:  storage
desktop
Extendable calculator for the 21st Century ⚡
Stars: ✭ 85 (-96.54%)
Mutual labels:  engine
mutant-swarm
Mutation testing framework and code coverage for Hive SQL
Stars: ✭ 20 (-99.19%)
Mutual labels:  hive
elara
Elara DB is an easy to use, lightweight key-value database that can also be used as a fast in-memory cache. Manipulate data structures in-memory, encrypt database files and export data. 🎯
Stars: ✭ 93 (-96.22%)
Mutual labels:  storage
ProjectFNF
ProjectFNF 2.0, based on Psych Engine
Stars: ✭ 22 (-99.11%)
Mutual labels:  engine
sparkar-volts
An extensive non-reactive Typescript framework that eases the development experience in Spark AR
Stars: ✭ 15 (-99.39%)
Mutual labels:  spark
selectel-storage-php-class
PHP class for Selectel storage
Stars: ✭ 42 (-98.29%)
Mutual labels:  storage
spark-kubernetes
Spark on Kubernetes
Stars: ✭ 80 (-96.75%)
Mutual labels:  spark
Spark-Ar
Resources for Spark AR
Stars: ✭ 43 (-98.25%)
Mutual labels:  spark
experiments
Code examples for my blog posts
Stars: ✭ 21 (-99.15%)
Mutual labels:  spark
shamash
Autoscaling for Google Cloud Dataproc
Stars: ✭ 31 (-98.74%)
Mutual labels:  spark
PhantasmaChain
Blockchain with native storage and smart contract integration.
Stars: ✭ 74 (-96.99%)
Mutual labels:  storage
localstorage-ponyfill
Universal LocalStorage for browser and Node.js.
Stars: ✭ 52 (-97.89%)
Mutual labels:  storage
nft.storage-tools
🛠 Utilities for working with nft.storage.
Stars: ✭ 15 (-99.39%)
Mutual labels:  storage
memo
Android processing and secured library for managing SharedPreferences as key-value elements efficiently and structurally.
Stars: ✭ 18 (-99.27%)
Mutual labels:  storage
sentry-spark
Apache Spark Sentry Integration
Stars: ✭ 14 (-99.43%)
Mutual labels:  spark
spica
Spica is a development engine to build fast & efficient applications.
Stars: ✭ 77 (-96.87%)
Mutual labels:  engine
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-99.19%)
Mutual labels:  spark
splink
Implementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (-92.64%)
Mutual labels:  spark
pvc-autoresizer
Auto-resize PersistentVolumeClaim objects based on Prometheus metrics
Stars: ✭ 124 (-94.96%)
Mutual labels:  storage
spark-stringmetric
Spark functions to run popular phonetic and string matching algorithms
Stars: ✭ 51 (-97.93%)
Mutual labels:  spark
visualize-data-with-python
A Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (-97.56%)
Mutual labels:  spark
jitterphysics
A cross-platform, realtime physics engine for all .NET apps.
Stars: ✭ 327 (-86.7%)
Mutual labels:  engine