1280 open-source projects that are alternatives or similar to Pipelinex

Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+3773.23%)
Great expectations
Always know what to expect from your data.
Stars: ✭ 5,808 (+4473.23%)
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-37.8%)
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+398.43%)
Mutual labels:  data-science, data-engineering
Mlj.jl
A Julia machine learning framework
Stars: ✭ 982 (+673.23%)
Mutual labels:  data-science, pipeline
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (+74.02%)
Mutual labels:  data-science, data-engineering
Open Solution Toxic Comments
Open solution to the Toxic Comment Classification Challenge
Stars: ✭ 154 (+21.26%)
Mutual labels:  data-science, pipeline
Bodywork Core
Deploy machine learning projects developed in Python to Kubernetes. Accelerated MLOps 🚀
Stars: ✭ 145 (+14.17%)
Mutual labels:  data-science, pipeline
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (+36.22%)
Mutual labels:  data-science, data-engineering
Lightautoml
LAMA - automatic model creation framework
Stars: ✭ 196 (+54.33%)
Mutual labels:  data-science, pipeline
Open Solution Salt Identification
Open solution to the TGS Salt Identification Challenge
Stars: ✭ 124 (-2.36%)
Mutual labels:  data-science, pipeline
Spark Alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-3.94%)
Mutual labels:  data-science, data-engineering
Batchflow
BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
Stars: ✭ 156 (+22.83%)
Mutual labels:  data-science, pipeline
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+1089.76%)
Mutual labels:  data-science, data-engineering
Targets
Function-oriented Make-like declarative workflows for R
Stars: ✭ 293 (+130.71%)
Mutual labels:  data-science, pipeline
Applied Ml
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+13934.65%)
Mutual labels:  data-science, data-engineering
Mlbox
MLBox is a powerful Automated Machine Learning python library.
Stars: ✭ 1,199 (+844.09%)
Mutual labels:  data-science, pipeline
Auptimizer
An automatic ML model optimization tool.
Stars: ✭ 166 (+30.71%)
Mutual labels:  data-science, data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-37.8%)
Mutual labels:  data-science, data-engineering
Steppy
Lightweight Python library for fast and reproducible experimentation 🔬
Stars: ✭ 119 (-6.3%)
Mutual labels:  data-science, pipeline
Learn Something Every Day
📝 A compilation of everything that I learn: Computer Science, Software Development, Engineering, Math, and Coding in General.
Stars: ✭ 362 (+185.04%)
Mutual labels:  data-science, data-engineering
Pdpipe
Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+364.57%)
Mutual labels:  data-science, pipeline
Data Science On Gcp
Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+580.31%)
Mutual labels:  data-science, data-engineering
Steppy Toolkit
Curated set of transformers that make your work with steppy faster and more effective 🔭
Stars: ✭ 21 (-83.46%)
Mutual labels:  data-science, pipeline
D6t Python
Accelerate data science
Stars: ✭ 118 (-7.09%)
Mutual labels:  data-science, data-engineering
Chain.jl
A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.
Stars: ✭ 118 (-7.09%)
Mutual labels:  data-science, pipeline
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+19.69%)
Mutual labels:  data-science, data-engineering
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+77.95%)
Mutual labels:  data-science, data-engineering
Accelerator
The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (+7.87%)
Mutual labels:  data-science, data-engineering
Drake
An R-focused pipeline toolkit for reproducibility and high-performance computing
Stars: ✭ 1,301 (+924.41%)
Mutual labels:  data-science, pipeline
Blurr
Data transformations for the ML era
Stars: ✭ 96 (-24.41%)
Mutual labels:  data-science, pipeline
Open Solution Mapping Challenge
Open solution to the Mapping Challenge 🌎
Stars: ✭ 291 (+129.13%)
Mutual labels:  data-science, pipeline
Prefect
The easiest way to automate your data
Stars: ✭ 7,956 (+6164.57%)
Mutual labels:  data-science, data-engineering
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (-0.79%)
Mutual labels:  data-science, data-engineering
Automlpipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
Stars: ✭ 223 (+75.59%)
Mutual labels:  data-science, pipeline
Drake Examples
Example workflows for the drake R package
Stars: ✭ 57 (-55.12%)
Mutual labels:  data-science, pipeline
Superset
Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+33470.08%)
Mutual labels:  data-science, data-engineering
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+1777.95%)
Mutual labels:  data-science, data-engineering
Ml Email Clustering
Email clustering with machine learning
Stars: ✭ 116 (-8.66%)
Mutual labels:  data-science
Learn Machine Learning
Learn to Build a Machine Learning Application from Top Articles
Stars: ✭ 116 (-8.66%)
Mutual labels:  data-science
Keras Contrib
Keras community contributions
Stars: ✭ 1,532 (+1106.3%)
Mutual labels:  data-science
Rightmove webscraper.py
Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: ✭ 125 (-1.57%)
Mutual labels:  data-science
Wooey
A Django app that creates automatic web UIs for Python scripts.
Stars: ✭ 1,680 (+1222.83%)
Mutual labels:  data-science
Dat8
General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+1093.7%)
Mutual labels:  data-science
Modelchimp
Experiment tracking for machine and deep learning projects
Stars: ✭ 121 (-4.72%)
Mutual labels:  data-science
Truvisory
This project provides resources for users who want to find good LinkedIn posts that contain material for learning any technology, design, self-branding, motivation, etc.
Stars: ✭ 116 (-8.66%)
Mutual labels:  data-science
Europa
Puppet Container Registry
Stars: ✭ 114 (-10.24%)
Mutual labels:  pipeline
Stock Prediction
Smart algorithms to predict buying and selling of stocks on the basis of mutual funds analysis, stock trends analysis and prediction, portfolio risk factor, stock and finance market news sentiment analysis, and selling profit ratio. Project developed as part of the NSE-FutureTech-Hackathon 2018, Mumbai. Team: Semicolon
Stars: ✭ 125 (-1.57%)
Mutual labels:  data-science
Sarek
Detect germline or somatic variants from normal or tumour/normal whole-genome or targeted sequencing
Stars: ✭ 124 (-2.36%)
Mutual labels:  pipeline
Pandas Videos
Jupyter notebook and datasets from the pandas Q&A video series
Stars: ✭ 1,716 (+1251.18%)
Mutual labels:  data-science
Scipy 2017 Cython Tutorial
Material for the SciPy 2017 Cython tutorial
Stars: ✭ 114 (-10.24%)
Mutual labels:  data-science
Seaborn Tutorial
This repository is my attempt to help Data Science aspirants gain necessary Data Visualization skills required to progress in their career. It includes all the types of plot offered by Seaborn, applied on random datasets.
Stars: ✭ 114 (-10.24%)
Mutual labels:  data-science
Variety
A schema analyzer for MongoDB
Stars: ✭ 1,592 (+1153.54%)
Mutual labels:  data-science
Mlr
Machine Learning in R
Stars: ✭ 1,542 (+1114.17%)
Mutual labels:  data-science
River
🌊 Online machine learning in Python
Stars: ✭ 2,980 (+2246.46%)
Mutual labels:  data-science
Dbg Pds
Deutsche Boerse's Financial Trading Public Data Set
Stars: ✭ 124 (-2.36%)
Mutual labels:  data-science
Auto ml
[UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (+1127.56%)
Mutual labels:  data-science
Ugene
UGENE is free open-source cross-platform bioinformatics software
Stars: ✭ 112 (-11.81%)
Mutual labels:  pipeline
Pythondata
repo for code published on pythondata.com
Stars: ✭ 113 (-11.02%)
Mutual labels:  data-science
Unix Stream
Turn Java 8 Streams into Unix like pipelines
Stars: ✭ 119 (-6.3%)
Mutual labels:  pipeline