573 open source projects that are alternatives or similar to Sparkflow

Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+229.43%)
Mutual labels:  dataframe, apache-spark
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (-90.07%)
Mutual labels:  apache-spark, dataframe
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Stars: ✭ 111 (-60.64%)
Mutual labels:  apache-spark, dataframe
Pdpipe
Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+109.22%)
Mutual labels:  dataframe, pipeline
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-46.81%)
Mutual labels:  dataframe, apache-spark
hyperdrive
Extensible streaming ingestion pipeline on top of Apache Spark
Stars: ✭ 31 (-89.01%)
Mutual labels:  apache-spark, pipeline
spark-streaming-visualize
Simple demonstration of how to build a complex real time machine learning visualization tool.
Stars: ✭ 16 (-94.33%)
Mutual labels:  apache-spark
ctdna-pipeline
A simplified pipeline for ctDNA sequencing data analysis
Stars: ✭ 29 (-89.72%)
Mutual labels:  pipeline
JT1078Gateway
A Pipeline-based JT1078Gateway implementation supporting TCP/UDP; it currently supports only three stream-pulling methods: http-flv, ws-flv, and hls
Stars: ✭ 50 (-82.27%)
Mutual labels:  pipeline
Dominando-Pandas
This repository is dedicated to the process of learning the Pandas library.
Stars: ✭ 22 (-92.2%)
Mutual labels:  dataframe
pyrealtime
Realtime data processing and plotting pipelines in Python
Stars: ✭ 62 (-78.01%)
Mutual labels:  pipeline
pipeline-editor
Cloud Pipelines Editor is a web app that allows users to build and run Machine Learning pipelines without having to set up a development environment.
Stars: ✭ 22 (-92.2%)
Mutual labels:  pipeline
Credit
An example project that predicts the risk of credit card default using a Logistic Regression classifier and a 30,000-sample dataset.
Stars: ✭ 18 (-93.62%)
Mutual labels:  pipeline
gitlab-merger-bot
GitLab Merger Bot
Stars: ✭ 23 (-91.84%)
Mutual labels:  pipeline
godot-exporter
Godot Engine Automation Pipeline Android – iOS – Linux – MacOS – Windows – HTML5 – Itch.io.
Stars: ✭ 54 (-80.85%)
Mutual labels:  pipeline
dropEst
Pipeline for initial analysis of droplet-based single-cell RNA-seq data
Stars: ✭ 71 (-74.82%)
Mutual labels:  pipeline
kedro
A Python framework for creating reproducible, maintainable and modular data science code.
Stars: ✭ 6,068 (+2051.77%)
Mutual labels:  pipeline
bistro
A library to build and execute typed scientific workflows
Stars: ✭ 43 (-84.75%)
Mutual labels:  pipeline
connector-x
Fastest library to load data from DB to DataFrames in Rust and Python
Stars: ✭ 550 (+95.04%)
Mutual labels:  dataframe
image-processing-pipeline
An image build orchestrator for the modern web
Stars: ✭ 43 (-84.75%)
Mutual labels:  pipeline
Rust Dataframe
A Rust DataFrame implementation, built on Apache Arrow
Stars: ✭ 271 (-3.9%)
Mutual labels:  dataframe
spot-termination-exporter
Prometheus spot instance exporter to monitor AWS instance termination with Hollowtrees
Stars: ✭ 30 (-89.36%)
Mutual labels:  pipeline
pipeline-as-code-with-jenkins
Pipeline as Code with Jenkins
Stars: ✭ 56 (-80.14%)
Mutual labels:  pipeline
spark-gradle-template
Apache Spark in your IDE with gradle
Stars: ✭ 39 (-86.17%)
Mutual labels:  apache-spark
metagraf
metaGraf is an opinionated specification for describing a software component and what its requirements are from the runtime environment. The mg command turns metaGraf specifications into Kubernetes resources, supporting CI, CD and GitOps software delivery.
Stars: ✭ 15 (-94.68%)
Mutual labels:  pipeline
cli-property-manager
Use this Property Manager CLI to automate Akamai property changes and deployments across many environments.
Stars: ✭ 22 (-92.2%)
Mutual labels:  pipeline
HAR
Recognize one of six human activities such as standing, sitting, and walking using a Softmax Classifier trained on mobile phone sensor data.
Stars: ✭ 18 (-93.62%)
Mutual labels:  pipeline
pipelines-as-code
Pipelines as Code
Stars: ✭ 37 (-86.88%)
Mutual labels:  pipeline
pyspark-asyncactions
Asynchronous actions for PySpark
Stars: ✭ 30 (-89.36%)
Mutual labels:  apache-spark
latent-semantic-analysis
Pipeline for training LSA models using Scikit-Learn.
Stars: ✭ 20 (-92.91%)
Mutual labels:  pipeline
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (-8.16%)
Mutual labels:  apache-spark
coronavirus-stats
Automatically scrape data and statistics on Coronavirus to make them easily accessible in CSV format
Stars: ✭ 47 (-83.33%)
Mutual labels:  pipeline
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-91.13%)
Mutual labels:  pipeline
pywedge
Makes Interactive Chart Widget, Cleans raw data, Runs baseline models, Interactive hyperparameter tuning & tracking
Stars: ✭ 49 (-82.62%)
Mutual labels:  dataframe
Datavec
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (-3.55%)
Mutual labels:  pipeline
leaflet heatmap
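A simple visualization of Huzhou call data. Assuming the data volume is too large for the browser to draw the heatmap directly, the heatmap-drawing step is moved offline for computation and analysis. After the data is computed in parallel with Apache Spark, Apache Spark is also used to draw the heatmap, and leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. The drawing is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than single-machine computation. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .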
Stars: ✭ 13 (-95.39%)
Mutual labels:  apache-spark
mmtf-workshop-2018
Structural Bioinformatics Training Workshop & Hackathon 2018
Stars: ✭ 50 (-82.27%)
Mutual labels:  apache-spark
elasticsearch-ingest-attachment-plugin-example
Example of how to use ElasticSearch ingest-attachment plugin using JavaScript
Stars: ✭ 19 (-93.26%)
Mutual labels:  pipeline
spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.0.0
Stars: ✭ 23 (-91.84%)
Mutual labels:  apache-spark
germline-DNA
A BioWDL variant-calling pipeline for germline DNA data that starts with FASTQ files and produces VCF files. Category:Multi-Sample
Stars: ✭ 21 (-92.55%)
Mutual labels:  pipeline
re-mote
Re-mote operations using SSH and Re-gent
Stars: ✭ 61 (-78.37%)
Mutual labels:  pipeline
skippa
SciKIt-learn Pipeline in PAndas
Stars: ✭ 33 (-88.3%)
Mutual labels:  pipeline
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (-2.13%)
Mutual labels:  apache-spark
etl
M-Lab ingestion pipeline
Stars: ✭ 15 (-94.68%)
Mutual labels:  pipeline
create-mithril-app
Sets up a mithril.js project with webpack
Stars: ✭ 20 (-92.91%)
Mutual labels:  pipeline
prose
A python framework to process FITS images. Built for Astronomy.
Stars: ✭ 21 (-92.55%)
Mutual labels:  pipeline
TACTIC-Handler
PySide based TACTIC client for maya, nuke, 3dsmax, houdini, etc
Stars: ✭ 67 (-76.24%)
Mutual labels:  pipeline
bifrost
A stream processing framework for high-throughput applications.
Stars: ✭ 48 (-82.98%)
Mutual labels:  pipeline
DNAscan
DNAscan is a fast and efficient bioinformatics pipeline that allows for the analysis of DNA Next Generation sequencing data, requiring very little computational effort and memory usage.
Stars: ✭ 36 (-87.23%)
Mutual labels:  pipeline
connected-component
Map Reduce Implementation of Connected Component on Apache Spark
Stars: ✭ 68 (-75.89%)
Mutual labels:  apache-spark
Nimdata
DataFrame API written in Nim, enabling fast out-of-core data processing
Stars: ✭ 261 (-7.45%)
Mutual labels:  dataframe
raccoon
Python DataFrame with fast insert and appends
Stars: ✭ 64 (-77.3%)
Mutual labels:  dataframe
jenkins-pipeline-gitflow-maven
Sample Maven project with a Jenkinsfile doing git-flow based release management
Stars: ✭ 47 (-83.33%)
Mutual labels:  pipeline
Computer-Architecture-Task-2
Riscv32 CPU Project
Stars: ✭ 43 (-84.75%)
Mutual labels:  pipeline
companion
This repository has been archived; the currently maintained version is at https://github.com/iii-companion/companion
Stars: ✭ 21 (-92.55%)
Mutual labels:  pipeline
ploio
Safe, Reliable, and Fast Production Deployments for Kubernetes
Stars: ✭ 11 (-96.1%)
Mutual labels:  pipeline
RNASeq
RNASeq pipeline
Stars: ✭ 30 (-89.36%)
Mutual labels:  pipeline
sfpowerscripts
A build system for modular development in Salesforce, delivered as a sfdx plugin that can be implemented in any CI/CD system of choice
Stars: ✭ 121 (-57.09%)
Mutual labels:  pipeline
only-pipe
A non-intrusive Python pipeline.
Stars: ✭ 19 (-93.26%)
Mutual labels:  pipeline
concourse-ci-kube
Concourse CI Kube Deployment
Stars: ✭ 16 (-94.33%)
Mutual labels:  pipeline