744 open source projects that are alternatives to or similar to sparklanes

cmip6 preprocessing
Analysis ready CMIP6 data in python the easy way with pangeo tools.
Stars: ✭ 126 (+641.18%)
Mutual labels:  preprocessing
bookmarks
A PySide2 based file and asset manager for animation and CG productions.
Stars: ✭ 33 (+94.12%)
Mutual labels:  pipeline
eidos-audition
Collection of auditory models.
Stars: ✭ 25 (+47.06%)
Mutual labels:  pipeline
rna-seq-snakemake
Snakemake based pipeline for RNA-Seq analysis
Stars: ✭ 29 (+70.59%)
Mutual labels:  pipeline
timit-preprocessor
Extract mfcc vectors and phones from TIMIT dataset
Stars: ✭ 14 (-17.65%)
Mutual labels:  data-preprocessing
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+252.94%)
Mutual labels:  data-processing
wikirepo
Python based Wikidata framework for easy dataframe extraction
Stars: ✭ 33 (+94.12%)
Mutual labels:  etl
krawler
A minimalist (geospatial) ETL
Stars: ✭ 51 (+200%)
Mutual labels:  etl
nanoflow
🔬 De novo assembly of nanopore reads using nextflow
Stars: ✭ 20 (+17.65%)
Mutual labels:  pipeline
nodejs-docker-example
An example of how to run a Node.js project in Docker in a Buildkite pipeline
Stars: ✭ 39 (+129.41%)
Mutual labels:  pipeline
ember-pipeline
Railway oriented programming in Ember
Stars: ✭ 17 (+0%)
Mutual labels:  pipeline
hive-metastore-client
A client for connecting and running DDLs on hive metastore.
Stars: ✭ 37 (+117.65%)
Mutual labels:  etl
lightflow
A lightweight, distributed workflow system
Stars: ✭ 67 (+294.12%)
Mutual labels:  pipeline
jgit-spark-connector
jgit-spark-connector is a library for running scalable data retrieval pipelines that process any number of Git repositories for source code analysis.
Stars: ✭ 71 (+317.65%)
Mutual labels:  pyspark
modelbox
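A high-performance, highly extensible, easy-to-use framework for AI applications. It gives AI application developers a unified, high-performance, easy-to-use programming framework for quickly building AI industry applications across device, edge, and cloud on top of full-stack AI services.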
Stars: ✭ 48 (+182.35%)
Mutual labels:  pipeline
nemesyst
Generalised and highly customisable, hybrid-parallelism, database based, deep learning framework.
Stars: ✭ 17 (+0%)
Mutual labels:  pipeline
oic-options-chains
ETL for OIC Options Chains
Stars: ✭ 22 (+29.41%)
Mutual labels:  etl
NBi
NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an Xml syntax. By the means of NBi, you don't need to develop C# or Java code to specify your tests! Either, you don't need Visual Studio or Eclipse to compile y…
Stars: ✭ 102 (+500%)
Mutual labels:  etl
eventkit
Event-driven data pipelines
Stars: ✭ 94 (+452.94%)
Mutual labels:  pipeline
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (+17.65%)
Mutual labels:  etl
towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Stars: ✭ 821 (+4729.41%)
Mutual labels:  pipeline
lncpipe
UNDER DEVELOPMENT--- Analysis of long non-coding RNAs from RNA-seq datasets
Stars: ✭ 24 (+41.18%)
Mutual labels:  pipeline
krsh
A declarative KubeFlow Management Tool
Stars: ✭ 127 (+647.06%)
Mutual labels:  pipeline
isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Stars: ✭ 28 (+64.71%)
Mutual labels:  pyspark
GoEmotions-pytorch
PyTorch implementation of GoEmotions 😍😢😱
Stars: ✭ 95 (+458.82%)
Mutual labels:  pipeline
spark3D
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
Stars: ✭ 23 (+35.29%)
Mutual labels:  pyspark
ECG analysis
No description or website provided.
Stars: ✭ 32 (+88.24%)
Mutual labels:  data-processing
makepipe
Tools for constructing simple make-like pipelines in R.
Stars: ✭ 23 (+35.29%)
Mutual labels:  pipeline
wrangle
A data transformation package for deep learning with Autonomio, Keras and TensorFlow.
Stars: ✭ 15 (-11.76%)
Mutual labels:  etl
pipe
Functional Pipeline in Go
Stars: ✭ 30 (+76.47%)
Mutual labels:  pipeline
CVparser
CVparser is software for parsing or extracting data out of CV/resumes.
Stars: ✭ 28 (+64.71%)
Mutual labels:  etl
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+7847.06%)
Mutual labels:  pyspark
traceml
Engine for ML/Data tracking, visualization, dashboards, and model UI for Polyaxon.
Stars: ✭ 445 (+2517.65%)
Mutual labels:  data-processing
nextNEOpi
nextNEOpi: a comprehensive pipeline for computational neoantigen prediction
Stars: ✭ 42 (+147.06%)
Mutual labels:  pipeline
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+176.47%)
Mutual labels:  etl
targets-tutorial
Short course on the targets R package
Stars: ✭ 87 (+411.76%)
Mutual labels:  pipeline
etlflow
EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various different tasks, jobs on GCP and AWS.
Stars: ✭ 38 (+123.53%)
Mutual labels:  etl
assume-role-arn
🤖🎩assume-role-arn allows you to easily assume an AWS IAM role in your CI/CD pipelines, without worrying about external dependencies.
Stars: ✭ 54 (+217.65%)
Mutual labels:  pipeline
django-calaccess-raw-data
A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
Stars: ✭ 61 (+258.82%)
Mutual labels:  etl
frizzle
The magic message bus
Stars: ✭ 14 (-17.65%)
Mutual labels:  pipeline
jobAnalytics and search
JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.
Stars: ✭ 25 (+47.06%)
Mutual labels:  pyspark
thain
Thain is a distributed flow schedule platform.
Stars: ✭ 81 (+376.47%)
Mutual labels:  etl
modelscript
REPO MOVED TO https://github.com/repetere/jsonstack-data - Data Science and Machine learning in JavaScript
Stars: ✭ 40 (+135.29%)
Mutual labels:  data-preprocessing
html-pipeline
Go version of the HTML processing filters and utilities
Stars: ✭ 18 (+5.88%)
Mutual labels:  pipeline
Docker Android Build Box
An optimized Docker image that includes the Android, Kotlin, and Flutter SDKs.
Stars: ✭ 245 (+1341.18%)
Mutual labels:  pipeline
redundans
Redundans is a pipeline that assists the assembly of heterozygous/polymorphic genomes.
Stars: ✭ 90 (+429.41%)
Mutual labels:  pipeline
Automlpipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
Stars: ✭ 223 (+1211.76%)
Mutual labels:  pipeline
get phylomarkers
A pipeline to select optimal markers for microbial phylogenomics and species tree estimation using coalescent and concatenation approaches
Stars: ✭ 34 (+100%)
Mutual labels:  pipeline
Redispipe
High-throughput Redis client for Go with implicit pipelining
Stars: ✭ 215 (+1164.71%)
Mutual labels:  pipeline
NVTabular
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.
Stars: ✭ 797 (+4588.24%)
Mutual labels:  preprocessing
jenkins-terraform-pipeline
Create a Jenkins pipeline that uses Terraform to manage AWS resources
Stars: ✭ 17 (+0%)
Mutual labels:  pipeline
csvplus
csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.
Stars: ✭ 67 (+294.12%)
Mutual labels:  etl
DataX-src
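DataX is a widely used offline data synchronization tool/platform for heterogeneous data. It provides efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, OTS, and ODPS.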
Stars: ✭ 21 (+23.53%)
Mutual labels:  etl
langx-java
Java tools, helpers, and common utilities. A replacement for guava, apache-commons, and hutool
Stars: ✭ 50 (+194.12%)
Mutual labels:  pipeline
google classroom
Google Classroom Data Pipeline
Stars: ✭ 17 (+0%)
Mutual labels:  pipeline
cubetl
CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (+23.53%)
Mutual labels:  etl
GeneLab Data Processing
No description or website provided.
Stars: ✭ 32 (+88.24%)
Mutual labels:  pipeline
hic
Analysis of Chromosome Conformation Capture data (Hi-C)
Stars: ✭ 45 (+164.71%)
Mutual labels:  pipeline
chronicle-etl
📜 A CLI toolkit for extracting and working with your digital history
Stars: ✭ 78 (+358.82%)
Mutual labels:  etl
dmriprep
dMRIPrep is a robust and easy-to-use pipeline for preprocessing diverse dMRI data. The transparent workflow dispenses with manual intervention, thereby ensuring the reproducibility of the results.
Stars: ✭ 55 (+223.53%)
Mutual labels:  preprocessing