everdark / k9

Licence: other
Self-Taught Data Science

Programming Languages

HTML
75241 projects
Jupyter Notebook
11667 projects
javascript
184084 projects - #8 most used programming language
python
139335 projects - #7 most used programming language
CSS
56736 projects
TeX
3793 projects

Projects that are alternatives of or similar to k9

awesome-datascience-python
Awesome list Data Science and Python. 🐍
Stars: ✭ 62 (+148%)
Mutual labels:  statistics
dml
R package for Distance Metric Learning
Stars: ✭ 58 (+132%)
Mutual labels:  statistics
foremast-brain
Foremast-brain is a component of Foremast project.
Stars: ✭ 17 (-32%)
Mutual labels:  statistics
Expectations.jl
Expectation operators for Distributions.jl objects
Stars: ✭ 50 (+100%)
Mutual labels:  statistics
HeroesMatchTracker
Heroes of the Storm match tracker for personal statistics
Stars: ✭ 59 (+136%)
Mutual labels:  statistics
veridical-flow
Making it easier to build stable, trustworthy data-science pipelines.
Stars: ✭ 28 (+12%)
Mutual labels:  statistics
FantasyPremierLeague.py
⚽ Statistics for your mini leagues.
Stars: ✭ 123 (+392%)
Mutual labels:  statistics
kitsu-season-trends
🦊 Kitsu seasonal anime trends
Stars: ✭ 13 (-48%)
Mutual labels:  statistics
Algorithmic-Trading
I have been deeply interested in algorithmic trading and systematic trading algorithms. This Repository contains the code of what I have learnt on the way. It starts form some basic simple statistics and will lead up to complex machine learning algorithms.
Stars: ✭ 47 (+88%)
Mutual labels:  statistics
procstat
Easy way to expose process internal state to filesystem using fuse.
Stars: ✭ 14 (-44%)
Mutual labels:  statistics
tics
🎢 Simple self-hosted analytics ideal for Express / React Native stacks
Stars: ✭ 22 (-12%)
Mutual labels:  statistics
scanstatistics
An R package for space-time anomaly detection using scan statistics.
Stars: ✭ 41 (+64%)
Mutual labels:  statistics
rsiena
An R package for Simulation Investigation for Empirical Network Analysis
Stars: ✭ 56 (+124%)
Mutual labels:  statistics
wrapperr
Website and API that collects Plex statistics using Tautulli and displays it. Similar to the Spotify Wrapped concept.
Stars: ✭ 93 (+272%)
Mutual labels:  statistics
yt-channels-DS-AI-ML-CS
A comprehensive list of 180+ YouTube Channels for Data Science, Data Engineering, Machine Learning, Deep learning, Computer Science, programming, software engineering, etc.
Stars: ✭ 1,038 (+4052%)
Mutual labels:  statistics
hdfe
No description or website provided.
Stars: ✭ 22 (-12%)
Mutual labels:  statistics
hmac-timing-attacks
HMAC timing attack's w/ statistical analysis
Stars: ✭ 22 (-12%)
Mutual labels:  statistics
math-stats
A small library that does the statistics for your numbers.
Stars: ✭ 18 (-28%)
Mutual labels:  statistics
kf2-magicked-admin
🕷️ Mutator-free management, statistics, and in-game bot for ranked Killing Floor 2 servers
Stars: ✭ 27 (+8%)
Mutual labels:  statistics
vtuber-livechat-dataset
📊 VTuber 1B: Billion-scale Live Chat and Moderation Event Dataset for NLP
Stars: ✭ 30 (+20%)
Mutual labels:  statistics

Self-Taught Data Science Playground

This repository is a collection of my self-taught notebooks on data science theory and practice. A great deal of effort goes into balancing methodology derivation (with math) and hands-on coding. The target audience is data science practitioners (including myself) with hands-on experience who are seeking a more in-depth understanding of machine learning algorithms and the relevant statistics.

Visit the website Hello, Data Science!, which hosts all the notebooks as nicely rendered HTML.

Notebooks Summary

notebooks/

Each notebook is written in either Jupyter or R Markdown. Most notebooks use Python, R, or both. I sometimes interoperate the two languages in a single notebook, which is made possible by reticulate.

Laboratory Scripts

labs/

These are quick-and-dirty scripts for exploring a variety of open source machine learning tools. They may be incomplete and messy to read.

[Optional] Setup Python Environment

To ensure reproducibility, it is recommended to use pyenv along with pyenv-virtualenv to control both the Python version and the package versions.

pyenv supports only Linux and macOS. For Windows users, conda is recommended instead.

Install a Different Python Version

To use a virtualenv with reticulate in Rmd, the Python involved must be built with a shared library:

PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.7.0
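
To confirm that an installed interpreter was built this way, a small check along these lines may help. It is a minimal sketch and not part of the repository: the function name `built_with_shared_library` is made up for illustration, and it relies on the `Py_ENABLE_SHARED` build variable, which is assumed to be populated only on POSIX builds (Linux and macOS).

```python
import sysconfig


def built_with_shared_library() -> bool:
    """Return True if this interpreter was configured with --enable-shared.

    Hypothetical helper for illustration. Py_ENABLE_SHARED is 1 or 0 on
    POSIX builds and may be None elsewhere, which is treated as False.
    """
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


if __name__ == "__main__":
    print("shared library build:", built_with_shared_library())
    # LDLIBRARY names the Python library file for this build, if recorded.
    print("library file:", sysconfig.get_config_var("LDLIBRARY"))
```

Run it with the interpreter that pyenv installed. If it reports `False`, reinstalling that version with the configure option shown above is the likely fix.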

Create virtualenv

Each notebook has different package dependencies. Here is an example that creates an environment specific to the notebook on model explainability:

cd notebooks/ml/model_explain
pyenv virtualenv 3.7.0 k9-model-explain
pyenv local k9-model-explain
pip install --upgrade pip
pip install -r requirements.txt
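
Once the environment is active, a quick sanity check such as the following can confirm that Python is running inside a virtualenv rather than the base interpreter. This is a sketch under assumptions: `in_virtualenv` is a hypothetical helper, not something the repository provides, and it only inspects interpreter prefixes, so it cannot tell which virtualenv is active.

```python
import sys


def in_virtualenv() -> bool:
    """Return True if the running interpreter belongs to a virtual environment.

    Hypothetical helper for illustration. Environments created by the venv
    module make sys.prefix differ from sys.base_prefix; older virtualenv
    releases set sys.real_prefix instead.
    """
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("python version:", ".".join(map(str, sys.version_info[:3])))
    print("environment prefix:", sys.prefix)
    print("inside a virtualenv:", in_virtualenv())
```

Running it from `notebooks/ml/model_explain` after `pyenv local k9-model-explain` should show a prefix that points at the new environment.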

TODO

Topics

  • Machine Learning
    • Factorization Machines
    • Recurrent Neural Nets
    • Sequence-to-Sequence Models
    • GANs
    • Reinforcement Learning Basics
    • Approximate Nearest Neighbor
  • Statistics
    • Law of Large Numbers and Central Limit Theorem
    • On Linear Regression: Machine Learning vs Econometrics
    • Linear Mixed Effects Models
    • Naive Bayes
    • Bayesian Model Diagnostics
    • Bayesian Time Series Forecasting
  • Tools/Programming
    • PyTorch Hands-On
    • RASA Chatbot Framework Hands-On
  • Programming
    • R
      • Production Quality Shiny App Development
    • Python
      • Dash for Interactive Dashboarding
  • Projects
    • Model Deployment with gRPC

Site

  • Dockerize each notebook (for complete reproducibility and portability)?
  • Tidy up dependencies for each notebook