Fluent Plugin S3
Amazon S3 input and output plugin for Fluentd
Stars: ✭ 276 (+210.11%)
Nodb
NoDB isn't a database... but it sort of looks like one.
Stars: ✭ 353 (+296.63%)
Aws Airflow Stack
Turbine: the bare metal that gets you Airflow
Stars: ✭ 352 (+295.51%)
Datavec
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (+205.62%)
Kiba Plus
Kiba enhancement for Ruby ETL.
Stars: ✭ 47 (-47.19%)
Ddlparse
Parse DDL and convert it to BigQuery JSON schema and DDL statements
Stars: ✭ 52 (-41.57%)
Enterprise gateway
A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
Stars: ✭ 412 (+362.92%)
Pglogical
Logical Replication extension for PostgreSQL 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Stars: ✭ 455 (+411.24%)
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+356.18%)
Pointblank
Data validation and organization of metadata for data frames and database tables
Stars: ✭ 480 (+439.33%)
Evolve
Database migration tool for .NET and .NET Core projects. Inspired by Flyway.
Stars: ✭ 477 (+435.96%)
Helk
The Hunting ELK
Stars: ✭ 3,097 (+3379.78%)
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-34.83%)
Justenoughscalaforspark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Stars: ✭ 538 (+504.49%)
Labnotebook
LabNotebook is a tool that allows you to flexibly monitor, record, save, and query all your machine learning experiments.
Stars: ✭ 526 (+491.01%)
Django S3direct
Directly upload files to S3 compatible services with Django.
Stars: ✭ 570 (+540.45%)
S3 Benchmark
Measure Amazon S3's performance from any location.
Stars: ✭ 525 (+489.89%)
Pixiedust
Python Helper library for Jupyter Notebooks
Stars: ✭ 998 (+1021.35%)
Aws Utilities
Docker images and scripts to deploy to AWS
Stars: ✭ 52 (-41.57%)
Pyspark Examples
Code examples on Apache Spark using Python
Stars: ✭ 58 (-34.83%)
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+6255.06%)
Aws Mobile React Sample
A React Starter App that shows how web developers can integrate their front end with AWS on the backend. The App interacts with AWS Cognito, API Gateway, Lambda and DynamoDB on the backend.
Stars: ✭ 650 (+630.34%)
Falcon
Free, open-source SQL client for Windows and Mac 🦅
Stars: ✭ 4,848 (+5347.19%)
Pgbackrest
Reliable PostgreSQL Backup & Restore
Stars: ✭ 766 (+760.67%)
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+737.08%)
Github To S3 Lambda Deployer
⚓️ GitHub webhook extension, written in Node.js, that uses Lambda to upload static pages to AWS S3 directly after committing to master
Stars: ✭ 23 (-74.16%)
Aws Auto Terminate Idle Emr
AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and an AWS Lambda Python script built on Boto3 to terminate AWS EMR clusters that have been idle for a specified period of time.
Stars: ✭ 21 (-76.4%)
Awslib scala
An idiomatic Scala wrapper around the AWS Java SDK
Stars: ✭ 20 (-77.53%)
Sparkmagic
Jupyter magics and kernels for working with remote Spark clusters
Stars: ✭ 954 (+971.91%)
Panther
Detect threats with log data and improve cloud security posture
Stars: ✭ 885 (+894.38%)
Aws S3 Scala
Scala client for Amazon S3
Stars: ✭ 35 (-60.67%)
Vagrant Projects
Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR
Stars: ✭ 34 (-61.8%)
Tedsds
Turbofan Engine Degradation Simulation Data Set example in Apache Spark
Stars: ✭ 14 (-84.27%)
Ether sql
A Python library to push Ethereum blockchain data into an SQL database.
Stars: ✭ 41 (-53.93%)
Aws Testing Library
Chai (https://chaijs.com) and Jest (https://jestjs.io/) assertions for testing services built with AWS
Stars: ✭ 52 (-41.57%)
Nagios Plugins
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Stars: ✭ 1,000 (+1023.6%)
Scrapy S3pipeline
Scrapy pipeline to store chunked items in an Amazon S3 or Google Cloud Storage bucket.
Stars: ✭ 57 (-35.96%)
Dbbench
🏋️ dbbench is a simple database benchmarking tool which supports several databases and your own scripts
Stars: ✭ 52 (-41.57%)
Discreetly
ETLy is an add-on dashboard service on top of Apache Airflow.
Stars: ✭ 60 (-32.58%)
Dockerfiles
50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu
Stars: ✭ 847 (+851.69%)
Terraform Aws S3 Log Storage
This module creates an S3 bucket suitable for receiving logs from other AWS services such as S3, CloudFront, and CloudTrail
Stars: ✭ 65 (-26.97%)
W2v
Word2Vec models with Twitter data using Spark. Blog:
Stars: ✭ 64 (-28.09%)
Aws
Swift wrapper around the AWS API
Stars: ✭ 67 (-24.72%)
Etl with python
ETL with Python - Taught at DWH course 2017 (TAU)
Stars: ✭ 68 (-23.6%)
Aws Inventory
Python script for AWS resources inventory (cheaper than AWS Config)
Stars: ✭ 69 (-22.47%)
Terraform Aws Airflow
Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs, and SQS as the message broker with CeleryExecutor
Stars: ✭ 69 (-22.47%)
Pysparkgeoanalysis
🌐 Interactive Workshop on GeoAnalysis using PySpark
Stars: ✭ 63 (-29.21%)
Sql Runner
Run templatable playbooks of SQL scripts in series and parallel on Redshift, PostgreSQL, BigQuery and Snowflake
Stars: ✭ 68 (-23.6%)
Transporter
Sync data between persistence engines, like ETL, only not stodgy
Stars: ✭ 1,175 (+1220.22%)