Optimus🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (-59.9%)
Hadoop cookbookCookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-96.67%)
Hops ExamplesExamples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-96.58%)
ODSC India 2018My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Stars: ✭ 26 (-98.94%)
MmlsparkSimple and Distributed Machine Learning
Stars: ✭ 2,899 (+17.89%)
swordfishOpen-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-98.58%)
Pyspark Cheatsheet🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-95.61%)
XsqlUnified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (-92.84%)
Hadoop DockerA Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Stars: ✭ 238 (-90.32%)
TiBigDataTiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (-92.19%)
liquibase-impalaLiquibase extension to add Impala Database support
Stars: ✭ 23 (-99.06%)
hadoop-etl-udfsThe Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-99.31%)
implyrSQL backend to dplyr for Impala
Stars: ✭ 74 (-96.99%)
Hive Jdbc Uber JarHive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Stars: ✭ 188 (-92.35%)
DrillApache Drill is a distributed MPP query layer for self describing data
Stars: ✭ 1,619 (-34.16%)
spark-extensionA library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-98.98%)
autThe Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-95.49%)
basinBasin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-98.98%)
QuillCompile-time Language Integrated Queries for Scala
Stars: ✭ 1,998 (-18.75%)
Devops Python Tools80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (-83.49%)
Sparkling TitanicTraining models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-99.51%)
Spark With PythonFundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-93.9%)
hive to esA small tool for syncing data from a Hive data warehouse to Elasticsearch
Stars: ✭ 21 (-99.15%)
hive-jdbc-driverAn alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Stars: ✭ 31 (-98.74%)
alluxio-pyAlluxio Python client - Access Any Data Source with Python
Stars: ✭ 18 (-99.27%)
vok-ormMapping rows from a SQL database to POJOs in its simplest form
Stars: ✭ 13 (-99.47%)
awesome-AI-kubernetes❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (-96.14%)
MiniRTSA game engine to learn about game engine development
Stars: ✭ 99 (-95.97%)
MonoGame.FormsMonoGame.Forms is the easiest way of integrating a MonoGame render window into your Windows Forms project. It should make your life much easier when you want to create your own editor environment.
Stars: ✭ 183 (-92.56%)
OpenAMOpenAM is an open access management solution that includes Authentication, SSO, Authorization, Federation, Entitlements and Web Services Security.
Stars: ✭ 476 (-80.64%)
spark-druid-olapSparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 286 (-88.37%)
docker-hiveDocker image for Apache Hive Metastore
Stars: ✭ 42 (-98.29%)
taucmdrPerformance engineering for the rest of us.
Stars: ✭ 26 (-98.94%)
DTCDTC is a high-performance Distributed Table Cache system designed by JD.com that offers hotspot data caching for databases in order to reduce database load and improve QPS.
Stars: ✭ 21 (-99.15%)
desktopExtendable calculator for the 21st Century ⚡
Stars: ✭ 85 (-96.54%)
mutant-swarmMutation testing framework and code coverage for Hive SQL
Stars: ✭ 20 (-99.19%)
elaraElara DB is an easy to use, lightweight key-value database that can also be used as a fast in-memory cache. Manipulate data structures in-memory, encrypt database files and export data. 🎯
Stars: ✭ 93 (-96.22%)
ProjectFNFProjectFNF 2.0, based on Psych Engine
Stars: ✭ 22 (-99.11%)
sparkar-voltsAn extensive non-reactive Typescript framework that eases the development experience in Spark AR
Stars: ✭ 15 (-99.39%)
Spark-ArResources for Spark AR
Stars: ✭ 43 (-98.25%)
experimentsCode examples for my blog posts
Stars: ✭ 21 (-99.15%)
shamashAutoscaling for Google Cloud Dataproc
Stars: ✭ 31 (-98.74%)
PhantasmaChainBlockchain with native storage and smart contract integration.
Stars: ✭ 74 (-96.99%)
memoAndroid processing and secured library for managing SharedPreferences as key-value elements efficiently and structurally.
Stars: ✭ 18 (-99.27%)
sentry-sparkApache Spark Sentry Integration
Stars: ✭ 14 (-99.43%)
spicaSpica is a development engine to build fast & efficient applications.
Stars: ✭ 77 (-96.87%)
fastdata-clusterFast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-99.19%)
splinkImplementation of Fellegi-Sunter's canonical model of record linkage in Apache Spark, including EM algorithm to estimate parameters
Stars: ✭ 181 (-92.64%)
pvc-autoresizerAuto-resize PersistentVolumeClaim objects based on Prometheus metrics
Stars: ✭ 124 (-94.96%)
spark-stringmetricSpark functions to run popular phonetic and string matching algorithms
Stars: ✭ 51 (-97.93%)
visualize-data-with-pythonA Jupyter notebook using some standard techniques for data science and data engineering to analyze data for the 2017 flooding in Houston, TX.
Stars: ✭ 60 (-97.56%)
jitterphysicsA cross-platform, realtime physics engine for all .NET apps.
Stars: ✭ 327 (-86.7%)