Airbyte: Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+3773.23%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-37.8%)
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+398.43%)
Mlj.jl: A Julia machine learning framework
Stars: ✭ 982 (+673.23%)
Ploomber: A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Stars: ✭ 221 (+74.02%)
Bodywork Core: Deploy machine learning projects developed in Python to Kubernetes. Accelerated MLOps 🚀
Stars: ✭ 145 (+14.17%)
Soda Sql: Metric collection, data testing and monitoring for SQL-accessible data
Stars: ✭ 173 (+36.22%)
Lightautoml: LAMA - automatic model creation framework
Stars: ✭ 196 (+54.33%)
Spark Alchemy: Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-3.94%)
Batchflow: BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
Stars: ✭ 156 (+22.83%)
Just Dashboard: 📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+1089.76%)
Targets: Function-oriented Make-like declarative workflows for R
Stars: ✭ 293 (+130.71%)
Applied Ml: 📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+13934.65%)
Mlbox: MLBox is a powerful Automated Machine Learning Python library.
Stars: ✭ 1,199 (+844.09%)
Auptimizer: An automatic ML model optimization tool.
Stars: ✭ 166 (+30.71%)
Sayn: Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-37.8%)
Steppy: Lightweight Python library for fast and reproducible experimentation 🔬
Stars: ✭ 119 (-6.3%)
Learn Something Every Day: 📝 A compilation of everything that I learn; Computer Science, Software Development, Engineering, Math, and Coding in General. Read the rendered results here ->
Stars: ✭ 362 (+185.04%)
Pdpipe: Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+364.57%)
Data Science On Gcp: Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+580.31%)
Steppy Toolkit: Curated set of transformers that make your work with steppy faster and more effective 🔭
Stars: ✭ 21 (-83.46%)
D6t Python: Accelerate data science
Stars: ✭ 118 (-7.09%)
Chain.jl: A Julia package for piping a value through a series of transformation expressions using a more convenient syntax than Julia's native piping functionality.
Stars: ✭ 118 (-7.09%)
Geni: A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+19.69%)
Gspread Pandas: A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+77.95%)
Accelerator: The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (+7.87%)
Drake: An R-focused pipeline toolkit for reproducibility and high-performance computing
Stars: ✭ 1,301 (+924.41%)
Blurr: Data transformations for the ML era
Stars: ✭ 96 (-24.41%)
Prefect: The easiest way to automate your data
Stars: ✭ 7,956 (+6164.57%)
Butterfree: A tool for building feature stores.
Stars: ✭ 126 (-0.79%)
Automlpipeline.jl: A package that makes it trivial to create and evaluate machine learning pipeline architectures.
Stars: ✭ 223 (+75.59%)
Drake Examples: Example workflows for the drake R package
Stars: ✭ 57 (-55.12%)
Superset: Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+33470.08%)
Aws Data Wrangler: Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+1777.95%)
Keras Contrib: Keras community contributions
Stars: ✭ 1,532 (+1106.3%)
Rightmove webscraper.py: Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: ✭ 125 (-1.57%)
Wooey: A Django app that creates automatic web UIs for Python scripts.
Stars: ✭ 1,680 (+1222.83%)
Dat8: General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+1093.7%)
Modelchimp: Experiment tracking for machine and deep learning projects
Stars: ✭ 121 (-4.72%)
Truvisory: This project provides resources for users who want to find good LinkedIn posts that contain resources for learning any Technology, Design, Self-Branding, Motivation, etc. You can visit the project by:
Stars: ✭ 116 (-8.66%)
Europa: Puppet Container Registry
Stars: ✭ 114 (-10.24%)
Stock Prediction: Smart algorithms to predict buying and selling of stocks on the basis of Mutual Funds Analysis, Stock Trends Analysis and Prediction, Portfolio Risk Factor, Stock and Finance Market News Sentiment Analysis, and Selling Profit Ratio. Project developed as part of NSE-FutureTech-Hackathon 2018, Mumbai. Team: Semicolon
Stars: ✭ 125 (-1.57%)
Sarek: Detect germline or somatic variants from normal or tumour/normal whole-genome or targeted sequencing
Stars: ✭ 124 (-2.36%)
Pandas Videos: Jupyter notebook and datasets from the pandas Q&A video series
Stars: ✭ 1,716 (+1251.18%)
Seaborn Tutorial: This repository is my attempt to help Data Science aspirants gain the Data Visualization skills required to progress in their careers. It includes all the types of plot offered by Seaborn, applied to random datasets.
Stars: ✭ 114 (-10.24%)
Variety: A schema analyzer for MongoDB
Stars: ✭ 1,592 (+1153.54%)
Mlr: Machine Learning in R
Stars: ✭ 1,542 (+1114.17%)
River: 🌊 Online machine learning in Python
Stars: ✭ 2,980 (+2246.46%)
Dbg Pds: Deutsche Boerse's Financial Trading Public Data Set
Stars: ✭ 124 (-2.36%)
Auto ml: [UNMAINTAINED] Automated machine learning for analytics & production
Stars: ✭ 1,559 (+1127.56%)
Ugene: UGENE is free open-source cross-platform bioinformatics software
Stars: ✭ 112 (-11.81%)
Pythondata: Repo for code published on pythondata.com
Stars: ✭ 113 (-11.02%)
Unix Stream: Turn Java 8 Streams into Unix-like pipelines
Stars: ✭ 119 (-6.3%)