Dagster
An orchestration platform for the development, production, and observation of data assets.
Stars: ✭ 4,099 (+17721.74%)
Mexican Government Report
Text Mining on the 2019 Mexican Government Report, covering from extracting text from a PDF file to plotting the results.
Stars: ✭ 473 (+1956.52%)
Awesome H2o
A curated list of research, applications and projects built using the H2O Machine Learning platform
Stars: ✭ 293 (+1173.91%)
Bowtie
Create a dashboard with Python!
Stars: ✭ 724 (+3047.83%)
Cookiecutter Data Science
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Stars: ✭ 5,271 (+22817.39%)
Issue Label Bot
Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"
Stars: ✭ 292 (+1169.57%)
Data Science Career
Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (+2639.13%)
Pandas Js
Pandas in JavaScript for data analysis and visualization
Stars: ✭ 389 (+1591.3%)
Python Articles
Monthly Series - Top 10 Python Articles
Stars: ✭ 288 (+1152.17%)
Lambdaschooldatascience
Completed assignments and coding challenges from the Lambda School Data Science program.
Stars: ✭ 22 (-4.35%)
Pynamical
Pynamical is a Python package for modeling and visualizing discrete nonlinear dynamical systems, chaos, and fractals.
Stars: ✭ 458 (+1891.3%)
Uncertainty Baselines
High-quality implementations of standard and SOTA methods on a variety of tasks.
Stars: ✭ 278 (+1108.7%)
Boltons
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
Stars: ✭ 5,671 (+24556.52%)
Poutyne
A simplified framework and utilities for PyTorch
Stars: ✭ 458 (+1891.3%)
Baikal
A graph-based functional API for building complex scikit-learn pipelines.
Stars: ✭ 573 (+2391.3%)
Production Data Science
Production Data Science: a workflow for collaborative data science aimed at production
Stars: ✭ 388 (+1586.96%)
Vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀
Stars: ✭ 6,793 (+29434.78%)
Sealion
The first machine learning framework that encourages learning ML concepts instead of memorizing class functions.
Stars: ✭ 278 (+1108.7%)
Pyod
A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Stars: ✭ 5,083 (+22000%)
Lazydata
Lazydata: Scalable data dependencies for Python projects
Stars: ✭ 627 (+2626.09%)
Urs
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
Stars: ✭ 275 (+1095.65%)
Serenata De Amor
🕵 Artificial Intelligence for social control of public administration
Stars: ✭ 4,367 (+18886.96%)
Open Quant Live Book
An open source, hands-on and fully reproducible book in quantitative finance, data science and econophysics. Join us and help Make Wall Street Great Again!
Stars: ✭ 275 (+1095.65%)
Courses
Quizzes and assignments from Coursera
Stars: ✭ 454 (+1873.91%)
Xlearn
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
Stars: ✭ 2,968 (+12804.35%)
Nfstream
NFStream: a Flexible Network Data Analysis Framework.
Stars: ✭ 622 (+2604.35%)
Python Causality Handbook
Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and sensitivity analysis.
Stars: ✭ 449 (+1852.17%)
Rust Dataframe
A Rust DataFrame implementation, built on Apache Arrow
Stars: ✭ 271 (+1078.26%)
Gophernotes
The Go kernel for Jupyter notebooks and nteract.
Stars: ✭ 3,100 (+13378.26%)
Turbodbc
Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
Stars: ✭ 449 (+1852.17%)
Data Describe
data⎰describe: Pythonic EDA Accelerator for Data Science
Stars: ✭ 269 (+1069.57%)
Matrixprofile Ts
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile
Stars: ✭ 621 (+2600%)
Geopython
Notebooks and libraries for spatial/geo Python explorations
Stars: ✭ 268 (+1065.22%)
Featexp
Feature exploration for supervised learning
Stars: ✭ 688 (+2891.3%)
Pygam
[HELP REQUESTED] Generalized Additive Models in Python
Stars: ✭ 569 (+2373.91%)
Cudf
cuDF - GPU DataFrame Library
Stars: ✭ 4,370 (+18900%)
Shogun
Shōgun
Stars: ✭ 2,859 (+12330.43%)
Biolitmap
Code for the paper "BIOLITMAP: a web-based geolocated and temporal visualization of the evolution of bioinformatics publications" in Oxford Bioinformatics.
Stars: ✭ 18 (-21.74%)
Nimbusml
Python machine learning package providing simple interoperability between ML.NET and scikit-learn components.
Stars: ✭ 265 (+1052.17%)
Subreddit Analyzer
A comprehensive Data and Text Mining workflow for submissions and comments from any given public subreddit.
Stars: ✭ 447 (+1843.48%)
Awesome Mlops
😎 A curated list of awesome MLOps tools
Stars: ✭ 258 (+1021.74%)
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+24491.3%)
Goro
A High-level Machine Learning Library for Go
Stars: ✭ 265 (+1052.17%)
Metaflow
🚀 Build and manage real-life data science projects with ease!
Stars: ✭ 5,108 (+22108.7%)
Sktime
A unified framework for machine learning with time series
Stars: ✭ 4,741 (+20513.04%)
Csvs To Sqlite
Convert CSV files into a SQLite database
Stars: ✭ 568 (+2369.57%)
Arquero
Query processing and transformation of array-backed data tables.
Stars: ✭ 384 (+1569.57%)
Dash Table
A First-Class Interactive DataTable for Dash
Stars: ✭ 382 (+1560.87%)
Querido Diario
📰 Brazilian government gazettes, accessible to everyone.
Stars: ✭ 681 (+2860.87%)
Datasets For Recommender Systems
A repository of topic-centric, high-quality public data sources for Recommender Systems (RS)
Stars: ✭ 564 (+2352.17%)