Spark Py Notebooks: Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+961.9%)
Archivespark: An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
Stars: ✭ 111 (-11.9%)
Repository: A personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-26.98%)
Istio Workshop: In this workshop, you'll learn how to install and configure Istio, an open source framework for connecting, securing, and managing microservices, on Google Kubernetes Engine, Google’s hosted Kubernetes product. You will also deploy an Istio-enabled multi-service application
Stars: ✭ 120 (-4.76%)
Lambda Arch: Applying Lambda Architecture with Spark, Kafka, and Cassandra.
Stars: ✭ 111 (-11.9%)
Spark Alchemy: Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-3.17%)
Ethereum Etl Airflow: Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. What datasets do you want to be added to Ethereum ETL? Vote here: https://blockchain-etl.convas.io.
Stars: ✭ 89 (-29.37%)
Elassandra: Elassandra = Elasticsearch + Apache Cassandra
Stars: ✭ 1,610 (+1177.78%)
Ammonite Spark: Run Spark calculations from Ammonite
Stars: ✭ 88 (-30.16%)
Parquet Index: Spark SQL index for Parquet tables
Stars: ✭ 109 (-13.49%)
Mais: Universalizing access to data in Brazil. Docs: https://basedosdados.github.io/mais/
Stars: ✭ 122 (-3.17%)
Hands Detection: Hands video tracker using the TensorFlow Object Detection API and a Faster RCNN model. The data used is the Hand Dataset from the University of Oxford.
Stars: ✭ 87 (-30.95%)
Pyspark Cheatsheet: 🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (-14.29%)
Cuesheet: A framework for writing Spark 2.x applications in a pretty way
Stars: ✭ 86 (-31.75%)
Esp V2: A service proxy that provides API management capabilities using Google Service Infrastructure.
Stars: ✭ 120 (-4.76%)
Flint: Webex Bot SDK for Node.js (deprecated in favor of https://github.com/webex/webex-bot-node-framework)
Stars: ✭ 85 (-32.54%)
Microservices Demo: Sample cloud-native application with 10 microservices showcasing Kubernetes, Istio, gRPC and OpenCensus.
Stars: ✭ 11,369 (+8923.02%)
Hops Examples: Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-33.33%)
Beast: Load data from Kafka to any data warehouse
Stars: ✭ 119 (-5.56%)
Spark States: Custom state store providers for Apache Spark
Stars: ✭ 83 (-34.13%)
Seldon Server: Machine Learning Platform and Recommendation Engine built on Kubernetes
Stars: ✭ 1,435 (+1038.89%)
Lehar: Visualize data using relative ordering
Stars: ✭ 81 (-35.71%)
Spark Infotheoretic Feature Selection: This package contains a generic implementation of greedy Information Theoretic Feature Selection (FS) methods. The implementation is based on the common theoretic framework presented by Gavin Brown. Implementations of mRMR, InfoGain, JMI and other commonly used FS filters are provided.
Stars: ✭ 123 (-2.38%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-37.3%)
Spark On K8s Operator: Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Stars: ✭ 1,780 (+1312.7%)
Spark Lucenerdd: Spark RDD with Lucene's query and entity linkage capabilities
Stars: ✭ 114 (-9.52%)
Splash: Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
Stars: ✭ 105 (-16.67%)
Ds Cheatsheets: List of Data Science Cheatsheets to rule the world
Stars: ✭ 9,452 (+7401.59%)
Zparkio: Boilerplate framework to use Spark and ZIO together.
Stars: ✭ 121 (-3.97%)
Apache Spark Hands On: Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-41.27%)
Drachtio Freeswitch Modules: A collection of open-sourced FreeSWITCH modules that I use in various drachtio applications
Stars: ✭ 73 (-42.06%)
Luigi Warehouse: A Luigi-powered analytics / warehouse stack
Stars: ✭ 72 (-42.86%)
Kubernetes Nexus: Run Sonatype Nexus Repository Manager OSS on top of Kubernetes (GKE). Includes instructions for automated backups (GCS) and day-to-day usage.
Stars: ✭ 122 (-3.17%)
Terrastack: This project is archived, but the idea of Terrastack lives on in the Terraform CDK. - https://github.com/hashicorp/terraform-cdk
Stars: ✭ 71 (-43.65%)
Linq To Bigquery: LINQ to BigQuery is a C# LINQ provider for Google BigQuery. It also enables a desktop GUI client with LINQPad and a plug-in driver.
Stars: ✭ 69 (-45.24%)
Python Bigdata: Data science and Big Data with Python
Stars: ✭ 112 (-11.11%)
Sql Runner: Run templatable playbooks of SQL scripts in series and parallel on Redshift, PostgreSQL, BigQuery and Snowflake
Stars: ✭ 68 (-46.03%)
Fast Mrmr: An improved implementation of the classical feature selection method minimum Redundancy and Maximum Relevance (mRMR).
Stars: ✭ 67 (-46.83%)
Kontextfrei: Writing application logic for Spark jobs that can be unit-tested without a SparkContext
Stars: ✭ 67 (-46.83%)
Thingsboard: Open-source IoT Platform - Device management, data collection, processing and visualization.
Stars: ✭ 10,526 (+8253.97%)
Gocloud: ☁️ Go API for open cloud
Stars: ✭ 112 (-11.11%)
Almond: A Scala kernel for Jupyter
Stars: ✭ 1,354 (+974.6%)