744 open source projects that are alternatives or similar to sparklanes

AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (+41.18%)
Mutual labels:  etl
Hands On Devops
A hands-on DevOps course covering the culture, methods and repeated practices of modern software development involving Packer, Vagrant, VirtualBox, Ansible, Kubernetes, K3s, MetalLB, Traefik, Docker-Compose, Docker, Taiga, GitLab, Drone CI, SonarQube, Selenium, InSpec, Alpine 3.10, Ubuntu-bionic, CentOS 7...
Stars: ✭ 196 (+1052.94%)
Mutual labels:  pipeline
cpp-can-isotp
C++ implementation of CAN ISO 15765-2 also known as CAN ISO transport protocol. CPP CAN isotp.
Stars: ✭ 14 (-17.65%)
Mutual labels:  etl
rec-core
Data pipelining service
Stars: ✭ 19 (+11.76%)
Mutual labels:  data-processing
arthur-redshift-etl
ELT Code for your Data Warehouse
Stars: ✭ 22 (+29.41%)
Mutual labels:  etl
Pipeline.rs
☔️ => ⛅️ => ☀️
Stars: ✭ 188 (+1005.88%)
Mutual labels:  pipeline
open-semantic-desktop-search
Virtual Machine for Desktop Search with Open Semantic Search
Stars: ✭ 22 (+29.41%)
Mutual labels:  etl
architect big data solutions with spark
code, labs and lectures for the course
Stars: ✭ 40 (+135.29%)
Mutual labels:  etl
gamechanger-data
GAMECHANGER aspires to be the Department’s trusted solution for evidence-based, data-driven decision-making across the universe of DoD requirements
Stars: ✭ 17 (+0%)
Mutual labels:  etl
Zumis
zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
Stars: ✭ 178 (+947.06%)
Mutual labels:  pipeline
cardano-py
Python3 lib and cli for operating a Cardano Passive Node and using the APIs. (PRE-ALPHA)
Stars: ✭ 17 (+0%)
Mutual labels:  etl
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+235.29%)
Mutual labels:  etl
spdr-etf-holdings
ETL for the SPDR ETF holdings XLS documents
Stars: ✭ 14 (-17.65%)
Mutual labels:  etl
Pypyr
pypyr task-runner cli & api for automation pipelines. Automate anything by combining commands, different scripts in different languages & applications into one pipeline process.
Stars: ✭ 173 (+917.65%)
Mutual labels:  pipeline
TEAM
The Taxonomy for ETL Automation Metadata (TEAM) is a metadata management tool for data warehouse automation. It is part of the ecosystem for data warehouse automation, alongside the Virtual Data Warehouse pattern manager and the generic schema for Data Warehouse Automation.
Stars: ✭ 27 (+58.82%)
Mutual labels:  etl
stargate
An Apache Pulsar client written in Elixir
Stars: ✭ 33 (+94.12%)
Mutual labels:  data-processing
koza
Data transformation framework for LinkML data models
Stars: ✭ 21 (+23.53%)
Mutual labels:  etl
Rnaseq Workflow
A repository for setting up a RNAseq workflow
Stars: ✭ 170 (+900%)
Mutual labels:  pipeline
oesophagus
Enterprise Grade Single-Step Streaming Data Infrastructure Setup. (Under Development)
Stars: ✭ 12 (-29.41%)
Mutual labels:  etl
pyspark-ML-in-Colab
Pyspark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (+88.24%)
Mutual labels:  pyspark
DataXServer
Adds remote multi-language invocation (ThriftServer, HttpServer) and distributed execution (DataX on YARN) to DataX (https://github.com/alibaba/DataX)
Stars: ✭ 130 (+664.71%)
Mutual labels:  etl
Plex
Open Source Pipeline for Maya, Houdini, 3ds Max and Nuke.
Stars: ✭ 170 (+900%)
Mutual labels:  pipeline
MIPS-pipeline-processor
A pipelined implementation of the MIPS processor featuring hazard detection as well as forwarding
Stars: ✭ 92 (+441.18%)
Mutual labels:  pipeline
Spark Practice
Apache Spark (PySpark) Practice on Real Data
Stars: ✭ 200 (+1076.47%)
Mutual labels:  pyspark
Unity resources
A list of resources and tutorials for those doing programming in Unity.
Stars: ✭ 170 (+900%)
Mutual labels:  pipeline
Spark Iforest
Isolation Forest on Spark
Stars: ✭ 166 (+876.47%)
Mutual labels:  pyspark
Sparkora
Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟
Stars: ✭ 51 (+200%)
Mutual labels:  pyspark
Linkis
Linkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+13564.71%)
Mutual labels:  pyspark
Cloud Dev
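Cloud R&D is a closed-loop, codified way of developing software that is born in the cloud. It lets business, development, and operations staff collaborate transparently in the same cloud environment across the entire software lifecycle (requirements, design, coding, build, deployment, operations), rather than working in isolation or relying on multiple separate tools to get the work done.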
Stars: ✭ 164 (+864.71%)
Mutual labels:  pipeline
bacannot
Generic but comprehensive pipeline for prokaryotic genome annotation and interrogation with interactive reports and shiny app.
Stars: ✭ 51 (+200%)
Mutual labels:  pipeline
Cc Pyspark
Process Common Crawl data with Python and Spark
Stars: ✭ 147 (+764.71%)
Mutual labels:  pyspark
Operator
Kubernetes operator to manage installation, updating and uninstallation of tektoncd projects (pipeline, …)
Stars: ✭ 161 (+847.06%)
Mutual labels:  pipeline
Repo 2019
BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, Tensorflow, Mathematics
Stars: ✭ 133 (+682.35%)
Mutual labels:  pyspark
ngs pipeline
Exome/Capture/RNASeq Pipeline Implementation using snakemake
Stars: ✭ 40 (+135.29%)
Mutual labels:  pipeline
Pyspark Cheatsheet
🐍 Quick reference guide to common patterns & functions in PySpark.
Stars: ✭ 108 (+535.29%)
Mutual labels:  pyspark
Spacy Wordnet
spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface
Stars: ✭ 156 (+817.65%)
Mutual labels:  pipeline
Pyspark Stubs
Apache (Py)Spark type annotations (stub files).
Stars: ✭ 98 (+476.47%)
Mutual labels:  pyspark
Apos.Content
Content builder library for MonoGame.
Stars: ✭ 14 (-17.65%)
Mutual labels:  pipeline
Spark Py Notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+7770.59%)
Mutual labels:  pyspark
Ects
Elastic Crontab System: a simple, easy-to-use distributed scheduled-task management system
Stars: ✭ 156 (+817.65%)
Mutual labels:  pipeline
Bitcoin Value Predictor
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Stars: ✭ 91 (+435.29%)
Mutual labels:  pyspark
ImcSegmentationPipeline
A pixel classification based multiplexed image segmentation pipeline
Stars: ✭ 62 (+264.71%)
Mutual labels:  pipeline
W2v
Word2Vec models with Twitter data using Spark.
Stars: ✭ 64 (+276.47%)
Mutual labels:  pyspark
Open Solution Toxic Comments
Open solution to the Toxic Comment Classification Challenge
Stars: ✭ 154 (+805.88%)
Mutual labels:  pipeline
Petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Stars: ✭ 1,108 (+6417.65%)
Mutual labels:  pyspark
MTBseq source
MTBseq is an automated pipeline for mapping, variant calling and detection of resistance mediating and phylogenetic variants from illumina whole genome sequence data of Mycobacterium tuberculosis complex isolates.
Stars: ✭ 26 (+52.94%)
Mutual labels:  pipeline
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+5700%)
Mutual labels:  pyspark
STOCK-RETURN-PREDICTION-USING-KNN-SVM-GUASSIAN-PROCESS-ADABOOST-TREE-REGRESSION-AND-QDA
Forecast stock prices using a machine learning approach. A time series analysis that employs predictive modeling to forecast stock returns, an approach used by hedge funds to select tradeable stocks.
Stars: ✭ 94 (+452.94%)
Mutual labels:  pipeline
proposal-hack-pipes
Old specification for Hack pipes in JavaScript. Please go to the new specification.
Stars: ✭ 87 (+411.76%)
Mutual labels:  pipeline
Live log analyzer spark
Spark application that analyzes Apache access logs and detects anomalies, with an accompanying Medium article.
Stars: ✭ 14 (-17.65%)
Mutual labels:  pyspark
needlestack
Multi-sample somatic variant caller
Stars: ✭ 45 (+164.71%)
Mutual labels:  pipeline
katana-skipper
Simple and flexible ML workflow engine
Stars: ✭ 234 (+1276.47%)
Mutual labels:  pipeline
Motorway
Cloud ready pure-python streaming data pipeline library
Stars: ✭ 150 (+782.35%)
Mutual labels:  pipeline
flamingo
FreeCAD - flamingo workbench
Stars: ✭ 30 (+76.47%)
Mutual labels:  pipeline
kafka-connect-datagen
A Kafka Connect source connector that generates data for tests
Stars: ✭ 27 (+58.82%)
Mutual labels:  etl
rivery cli
Rivery CLI
Stars: ✭ 16 (-5.88%)
Mutual labels:  etl
mech
🦾 Main repository for the Mech programming language. Start here!
Stars: ✭ 135 (+694.12%)
Mutual labels:  data-processing
web-click-flow
Offline log analysis of website clickstreams
Stars: ✭ 14 (-17.65%)
Mutual labels:  etl
DaFlow
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (+41.18%)
Mutual labels:  etl
prospectr
R package: Misc. Functions for Processing and Sample Selection of Spectroscopic Data
Stars: ✭ 26 (+52.94%)
Mutual labels:  preprocessing