Bio embeddings
Get protein embeddings from protein sequences
Stars: ✭ 86 (-43.79%)
Unix Stream
Turn Java 8 Streams into Unix-like pipelines
Stars: ✭ 119 (-22.22%)
Atacseq
ATAC-seq peak-calling, QC and differential analysis pipeline
Stars: ✭ 72 (-52.94%)
Hookah
A cross-platform tool for data pipelines.
Stars: ✭ 83 (-45.75%)
Bodywork Core
Deploy machine learning projects developed in Python to Kubernetes. Accelerated MLOps 🚀
Stars: ✭ 145 (-5.23%)
Kgtk
Knowledge Graph Toolkit
Stars: ✭ 81 (-47.06%)
Steppy
Lightweight Python library for fast and reproducible experimentation 🔬
Stars: ✭ 119 (-22.22%)
Scrapy demo
All kinds of Scrapy demos
Stars: ✭ 128 (-16.34%)
Reddit Detective
Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more
Stars: ✭ 129 (-15.69%)
Machine
Machine is a workflow/pipeline library for processing data
Stars: ✭ 78 (-49.02%)
Chain.jl
A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.
Stars: ✭ 118 (-22.88%)
Rangeless
C++ LINQ-like library of higher-order functions for data manipulation
Stars: ✭ 148 (-3.27%)
Dataspherestudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+681.05%)
Lastbackend
System for containerized apps management. From build to scaling.
Stars: ✭ 1,536 (+903.92%)
Locopy
locopy: Loading/Unloading to Redshift and Snowflake using Python.
Stars: ✭ 73 (-52.29%)
Etl.net
Mass data processing with a complete ETL for .NET developers
Stars: ✭ 129 (-15.69%)
Transporter
Sync data between persistence engines, like ETL only not stodgy
Stars: ✭ 1,175 (+667.97%)
Europa
Puppet Container Registry
Stars: ✭ 114 (-25.49%)
Globalbioticinteractions
Global Biotic Interactions provides access to existing species interaction datasets
Stars: ✭ 71 (-53.59%)
Ugene
UGENE is free, open-source, cross-platform bioinformatics software
Stars: ✭ 112 (-26.8%)
Etl with python
ETL with Python - Taught at DWH course 2017 (TAU)
Stars: ✭ 68 (-55.56%)
Tdp
The Darkest Pipeline - Multithreaded pipelines for modern C++
Stars: ✭ 67 (-56.21%)
Squeezemeta
A complete pipeline for metagenomic analysis
Stars: ✭ 128 (-16.34%)
Steeloverseer
A file watcher and development tool.
Stars: ✭ 110 (-28.1%)
Irap
integrated RNA-seq Analysis Pipeline
Stars: ✭ 65 (-57.52%)
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1113.07%)
Discreetly
ETLy is an add-on dashboard service on top of Apache Airflow.
Stars: ✭ 60 (-60.78%)
Scrapy S3pipeline
Scrapy pipeline to store chunked items in an Amazon S3 or Google Cloud Storage bucket.
Stars: ✭ 57 (-62.75%)
Motorway
Cloud-ready pure-Python streaming data pipeline library
Stars: ✭ 150 (-1.96%)
Omniparser
omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
Stars: ✭ 148 (-3.27%)
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (-8.5%)
Pipelinex
PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (-16.99%)
Aws Ecs Airflow
Run Airflow in AWS ECS (Elastic Container Service) using Fargate tasks
Stars: ✭ 107 (-30.07%)
Drake Examples
Example workflows for the drake R package
Stars: ✭ 57 (-62.75%)
Tormes
Making whole bacterial genome sequencing data analysis easy
Stars: ✭ 56 (-63.4%)
Gitlab Dashboard
📺 TV dashboard for a global view of GitLab Pipelines
Stars: ✭ 107 (-30.07%)
Stream Splicer
streaming pipeline with a mutable configuration
Stars: ✭ 52 (-66.01%)
Dawn
🌅 Dawn is a lightweight task management and build tool for front-end and nodejs.
Stars: ✭ 1,057 (+590.85%)
Kafka Connect
equivalent to kafka-connect 🔧 for nodejs ✨🐢🚀✨
Stars: ✭ 102 (-33.33%)
Salmonte
SalmonTE is an ultra-fast and scalable quantification pipeline for Transposable Element (TE) abundances
Stars: ✭ 49 (-67.97%)
Kiba Plus
Kiba enhancement for Ruby ETL.
Stars: ✭ 47 (-69.28%)
Csv2db
The CSV to database command line loader
Stars: ✭ 102 (-33.33%)
Bentools Etl
PHP ETL (Extract / Transform / Load) library with SOLID principles and almost no dependencies.
Stars: ✭ 45 (-70.59%)
Sns
Analysis pipelines for sequencing data
Stars: ✭ 43 (-71.9%)
Go spider
[Crawler framework (golang)] An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.
Stars: ✭ 1,745 (+1040.52%)
Semsegpipeline
A simpler way of reading and augmenting image segmentation data into TensorFlow
Stars: ✭ 126 (-17.65%)
Dotnettency
Multitenancy for dotnet applications
Stars: ✭ 100 (-34.64%)
Ensembl Hive
EnsEMBL Hive - a system for creating and running pipelines on a distributed compute resource
Stars: ✭ 44 (-71.24%)
Od
Czech open data
Stars: ✭ 99 (-35.29%)
Jenkins Os
Groovy pipeline jobs that build and test Container Linux with Jenkins
Stars: ✭ 43 (-71.9%)
Jenkins Workflow
Contains handy Groovy workflow-libs scripts
Stars: ✭ 41 (-73.2%)
Supra
SUPRA: Software Defined Ultrasound Processing for Real-Time Applications - An Open Source 2D and 3D Pipeline from Beamforming to B-Mode
Stars: ✭ 96 (-37.25%)
Ether sql
A Python library to push Ethereum blockchain data into an SQL database.
Stars: ✭ 41 (-73.2%)
Alchemist
A real-time ETL engine
Stars: ✭ 40 (-73.86%)