NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to develop C# or Java code to specify your tests! Nor do you need Visual Studio or Eclipse to compile y…

Stars: ✭ 102 (+500%)

Mutual labels: etl

eventkit

Event-driven data pipelines

Stars: ✭ 94 (+452.94%)

Mutual labels: pipeline

AirflowETL

Blog post on ETL pipelines with Airflow

Stars: ✭ 20 (+17.65%)

Mutual labels: etl

towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Stars: ✭ 821 (+4729.41%)

Mutual labels: pipeline

lncpipe

UNDER DEVELOPMENT: Analysis of long non-coding RNAs from RNA-seq datasets

Stars: ✭ 24 (+41.18%)

Mutual labels: pipeline

krsh

A declarative KubeFlow Management Tool

Stars: ✭ 127 (+647.06%)

Mutual labels: pipeline

isarn-sketches-spark

Routines and data structures for using isarn-sketches idiomatically in Apache Spark

Stars: ✭ 28 (+64.71%)

Mutual labels: pyspark

GoEmotions-pytorch

PyTorch implementation of GoEmotions 😍😢😱

Stars: ✭ 95 (+458.82%)

Mutual labels: pipeline

spark3D

Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …

Stars: ✭ 23 (+35.29%)

Mutual labels: pyspark

ECG analysis

No description or website provided.

Stars: ✭ 32 (+88.24%)

Mutual labels: data-processing

makepipe

Tools for constructing simple make-like pipelines in R.

Stars: ✭ 23 (+35.29%)

Mutual labels: pipeline

wrangle

A data transformation package for deep learning with Autonomio, Keras and TensorFlow.

Stars: ✭ 15 (-11.76%)

Mutual labels: etl

pipe

Functional Pipeline in Go

Stars: ✭ 30 (+76.47%)

Mutual labels: pipeline

CVparser

CVparser is software for parsing or extracting data out of CV/resumes.

Stars: ✭ 28 (+64.71%)

Mutual labels: etl

optimus

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

Stars: ✭ 1,351 (+7847.06%)

Mutual labels: pyspark

traceml

Engine for ML/Data tracking, visualization, dashboards, and model UI for Polyaxon.

Stars: ✭ 445 (+2517.65%)

Mutual labels: data-processing

nextNEOpi

nextNEOpi: a comprehensive pipeline for computational neoantigen prediction

Stars: ✭ 42 (+147.06%)

Mutual labels: pipeline

uptasticsearch

An Elasticsearch client tailored to data science workflows.

Stars: ✭ 47 (+176.47%)

Mutual labels: etl

targets-tutorial

Short course on the targets R package

Stars: ✭ 87 (+411.76%)

Mutual labels: pipeline

etlflow

EtlFlow is an ecosystem of functional libraries in Scala based on ZIO for writing various tasks and jobs on GCP and AWS.

Stars: ✭ 38 (+123.53%)

Mutual labels: etl

assume-role-arn

🤖🎩assume-role-arn allows you to easily assume an AWS IAM role in your CI/CD pipelines, without worrying about external dependencies.

Stars: ✭ 54 (+217.65%)

Mutual labels: pipeline

django-calaccess-raw-data

A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database

Stars: ✭ 61 (+258.82%)

Mutual labels: etl

frizzle

The magic message bus

Stars: ✭ 14 (-17.65%)

Mutual labels: pipeline

jobAnalytics and search

JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.

Stars: ✭ 25 (+47.06%)

Mutual labels: pyspark

thain

Thain is a distributed flow schedule platform.

Stars: ✭ 81 (+376.47%)

Mutual labels: etl

modelscript

REPO MOVED TO https://github.com/repetere/jsonstack-data - Data Science and Machine learning in JavaScript

Stars: ✭ 40 (+135.29%)

Mutual labels: data-preprocessing

html-pipeline

HTML processing filters and utilities, Go version

Stars: ✭ 18 (+5.88%)

Mutual labels: pipeline

Docker Android Build Box

An optimized Docker image that includes the Android, Kotlin, and Flutter SDKs.

Stars: ✭ 245 (+1341.18%)

Mutual labels: pipeline

redundans

Redundans is a pipeline that assists the assembly of heterozygous/polymorphic genomes.

Stars: ✭ 90 (+429.41%)

Mutual labels: pipeline

Automlpipeline.jl

A package that makes it trivial to create and evaluate machine learning pipeline architectures.

Stars: ✭ 223 (+1211.76%)

Mutual labels: pipeline

get phylomarkers

A pipeline to select optimal markers for microbial phylogenomics and species tree estimation using coalescent and concatenation approaches

Stars: ✭ 34 (+100%)

Mutual labels: pipeline

Redispipe

High-throughput Redis client for Go with implicit pipelining

Stars: ✭ 215 (+1164.71%)

Mutual labels: pipeline

NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

Stars: ✭ 797 (+4588.24%)

Mutual labels: preprocessing

jenkins-terraform-pipeline

Create a Jenkins pipeline that uses Terraform to manage AWS resources

Stars: ✭ 17 (+0%)

Mutual labels: pipeline

csvplus

csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.

Stars: ✭ 67 (+294.12%)

Mutual labels: etl

DataX-src

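DataX is a widely used offline data synchronization tool/platform for heterogeneous data. It provides efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, OTS, ODPS, and others.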

Stars: ✭ 21 (+23.53%)

Mutual labels: etl

langx-java

Java tools, helpers, and common utilities. A replacement for Guava, Apache Commons, and Hutool

Stars: ✭ 50 (+194.12%)

Mutual labels: pipeline

google classroom

Google Classroom Data Pipeline

Stars: ✭ 17 (+0%)

Mutual labels: pipeline

cubetl

CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)

Stars: ✭ 21 (+23.53%)

Mutual labels: etl

GeneLab Data Processing

No description or website provided.

Stars: ✭ 32 (+88.24%)

Mutual labels: pipeline

hic

Analysis of Chromosome Conformation Capture data (Hi-C)

Stars: ✭ 45 (+164.71%)

Mutual labels: pipeline

chronicle-etl

📜 A CLI toolkit for extracting and working with your digital history

Stars: ✭ 78 (+358.82%)

Mutual labels: etl

dmriprep

dMRIPrep is a robust and easy-to-use pipeline for preprocessing of diverse dMRI data. The transparent workflow dispenses with manual intervention, thereby ensuring the reproducibility of the results.

Stars: ✭ 55 (+223.53%)

Mutual labels: preprocessing
