Prefect: The easiest way to automate your data
Stars: ✭ 7,956 (+3500%)
Vds: Verteego Data Suite
Stars: ✭ 9 (-95.93%)
Polyaxon: Machine Learning Platform for Kubernetes (MLOps tools for experimentation and automation)
Stars: ✭ 2,966 (+1242.08%)
Gspread Pandas: A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Stars: ✭ 226 (+2.26%)
Lux: Python API for Intelligent Visual Data Discovery
Stars: ✭ 787 (+256.11%)
Crime Analysis: Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-90.95%)
Flyte: Accelerate your ML and data workflows to production. Flyte is a production-grade orchestration system for your data and ML workloads. It has been battle-tested at Lyft, Spotify, Freenome, and others, and is truly open source.
Stars: ✭ 1,242 (+461.99%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-64.25%)
Sci Pype: A machine learning API with native Redis caching and export + import using S3. Analyze entire datasets using an API for building, training, testing, analyzing, extracting, importing, and archiving. This repository can run from a Docker container or from the repository.
Stars: ✭ 90 (-59.28%)
Applied Ml: 📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+7965.16%)
Butterfree: A tool for building feature stores.
Stars: ✭ 126 (-42.99%)
Dash: Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.
Stars: ✭ 15,592 (+6955.2%)
Datastream.io: An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
Stars: ✭ 814 (+268.33%)
Pandas Profiling: Create HTML profiling reports from pandas DataFrame objects
Stars: ✭ 8,329 (+3668.78%)
Docker Images: Out-of-the-box Data Science / AI platform | The Swiss Army knife of AI and data science
Stars: ✭ 25 (-88.69%)
Drake Examples: Example workflows for the drake R package
Stars: ✭ 57 (-74.21%)
Covid19 Dashboard: A site that displays up-to-date COVID-19 stats, powered by fastpages.
Stars: ✭ 1,212 (+448.42%)
Drake: An R-focused pipeline toolkit for reproducibility and high-performance computing
Stars: ✭ 1,301 (+488.69%)
Python Training: Python training for business analysts and traders
Stars: ✭ 972 (+339.82%)
D6t Python: Accelerate data science
Stars: ✭ 118 (-46.61%)
Spark Alchemy: Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
Stars: ✭ 122 (-44.8%)
Pipelinex: PipelineX, a Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (-42.53%)
Ml Hub: 🧰 Multi-user development platform for machine learning teams. Simple to set up within minutes.
Stars: ✭ 148 (-33.03%)
Ml Workspace: 🛠 All-in-one web-based IDE specialized for machine learning and data science.
Stars: ✭ 2,337 (+957.47%)
Auptimizer: An automatic ML model optimization tool.
Stars: ✭ 166 (-24.89%)
Bowtie: Create a dashboard with Python!
Stars: ✭ 724 (+227.6%)
Cookbook 2nd: IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Stars: ✭ 704 (+218.55%)
Orchest: A new kind of IDE for Data Science.
Stars: ✭ 694 (+214.03%)
Airbyte: Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+2125.79%)
Lets Plot Kotlin: Kotlin API for Lets-Plot, an open-source plotting library for statistical data.
Stars: ✭ 181 (-18.1%)
Data Science On Gcp: Source code accompanying the book "Data Science on the Google Cloud Platform" by Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+290.95%)
Arcgis Python Api: Documentation and samples for ArcGIS API for Python
Stars: ✭ 954 (+331.67%)
Nteract: 📘 The interactive computing suite for you! ✨
Stars: ✭ 5,713 (+2485.07%)
Ppd599: USC urban data science course series with Python and Jupyter
Stars: ✭ 1,062 (+380.54%)
Interactive: .NET Interactive takes the power of .NET and embeds it into your interactive experiences. Share code, explore data, write, and learn across your apps in ways you couldn't before.
Stars: ✭ 978 (+342.53%)
Sayn: Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-64.25%)
Machinelearningcourse: A collection of notebooks from my Machine Learning class, written in Python 3
Stars: ✭ 35 (-84.16%)
Jupytemplate: Templates for Jupyter notebooks
Stars: ✭ 85 (-61.54%)
Cytoflow: A Python toolbox for quantitative, reproducible flow cytometry analysis
Stars: ✭ 90 (-59.28%)
Pyspark Example Project: Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+186.43%)
Just Dashboard: 📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+583.71%)
Responsible Ai Widgets: This project provides responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.
Stars: ✭ 107 (-51.58%)
Aws Data Wrangler: Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+979.19%)
Spark R Notebooks: R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-50.68%)
Accelerator: The Accelerator is a tool for fast and reproducible processing of large amounts of data.
Stars: ✭ 137 (-38.01%)
Beyond Jupyter: 🐍💻📊 All material from the PyCon.DE 2018 Talk "Beyond Jupyter Notebooks - Building your own data science platform with Python & Docker" (incl. Slides, Video, Udemy MOOC & other References)
Stars: ✭ 135 (-38.91%)
Data Science Stack Cookiecutter: 🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, AirFlow & API Star)
Stars: ✭ 153 (-30.77%)
Superset: Apache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+19191.4%)
Learnpythonforresearch: This repository provides everything you need to get started with Python for (social science) research.
Stars: ✭ 163 (-26.24%)
Primehub: A toil-free multi-tenancy machine learning platform in your Kubernetes cluster
Stars: ✭ 160 (-27.6%)
Soda Sql: Metric collection, data testing and monitoring for SQL-accessible data
Stars: ✭ 173 (-21.72%)
Geni: A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-31.22%)
Fastpages: An easy-to-use blogging platform, with enhanced support for Jupyter Notebooks.
Stars: ✭ 2,888 (+1206.79%)
Cookbook 2nd Code: Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Stars: ✭ 541 (+144.8%)
Fastai2: Temporary home for fastai v2 while it's being developed
Stars: ✭ 630 (+185.07%)
Jupyterlab Prodigy: 🧬 A JupyterLab extension for annotating data with Prodigy
Stars: ✭ 97 (-56.11%)
Batchflow: BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
Stars: ✭ 156 (-29.41%)
Plynx: PLynx is a domain-agnostic platform for managing reproducible experiments and data-oriented workflows.
Stars: ✭ 192 (-13.12%)