pmaji / Data Science Toolkit

Collection of stats, modeling, and data science tools in Python and R.

Projects that are alternatives of or similar to Data Science Toolkit

Pycm
Multi-class confusion matrix library in Python
Stars: ✭ 1,076 (+536.69%)
Mutual labels:  data-science, statistics, classification, data-mining, statistical-analysis
Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+1765.09%)
Mutual labels:  data-science, classification, data-mining, regression, data-visualization
Moderndive book
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
Stars: ✭ 527 (+211.83%)
Mutual labels:  data-science, tidyverse, ggplot2, regression, data-visualization
Mlr
Machine Learning in R
Stars: ✭ 1,542 (+812.43%)
Mutual labels:  data-science, statistics, classification, regression
Papers Literature Ml Dl Rl Ai
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
Stars: ✭ 1,341 (+693.49%)
Mutual labels:  data-science, statistics, data-mining, reinforcement-learning
Biolitmap
Code for the paper "BIOLITMAP: a web-based geolocated and temporal visualization of the evolution of bioinformatics publications" in Oxford Bioinformatics.
Stars: ✭ 18 (-89.35%)
Mutual labels:  data-science, data-mining, natural-language-processing, data-visualization
Book Socialmediaminingpython
Companion code for the book "Mastering Social Media Mining with Python"
Stars: ✭ 462 (+173.37%)
Mutual labels:  data-science, data-mining, natural-language-processing, data-visualization
Ml
A high-level machine learning and deep learning library for the PHP language.
Stars: ✭ 1,270 (+651.48%)
Mutual labels:  data-science, classification, natural-language-processing, regression
Mlj.jl
A Julia machine learning framework
Stars: ✭ 982 (+481.07%)
Mutual labels:  data-science, statistics, classification, regression
Dat8
General Assembly's 2015 Data Science course in Washington, DC
Stars: ✭ 1,516 (+797.04%)
Mutual labels:  data-science, natural-language-processing, logistic-regression, data-visualization
Metriculous
Measure and visualize machine learning model performance without the usual boilerplate.
Stars: ✭ 71 (-57.99%)
Mutual labels:  data-science, statistics, classification, regression
Openml R
R package to interface with OpenML
Stars: ✭ 81 (-52.07%)
Mutual labels:  data-science, statistics, classification, regression
Smile
Statistical Machine Intelligence & Learning Engine
Stars: ✭ 5,412 (+3102.37%)
Mutual labels:  data-science, statistics, classification, regression
Awesome Fraud Detection Papers
A curated list of data mining papers about fraud detection.
Stars: ✭ 843 (+398.82%)
Mutual labels:  data-science, classification, data-mining, logistic-regression
Tensorflow Book
Accompanying source code for Machine Learning with TensorFlow. Refer to the book for step-by-step explanations.
Stars: ✭ 4,448 (+2531.95%)
Mutual labels:  classification, reinforcement-learning, logistic-regression, regression
Machine Learning From Scratch
Succinct Machine Learning algorithm implementations from scratch in Python, solving real-world problems (Notebooks and Book). Examples of Logistic Regression, Linear Regression, Decision Trees, K-means clustering, Sentiment Analysis, Recommender Systems, Neural Networks and Reinforcement Learning.
Stars: ✭ 42 (-75.15%)
Mutual labels:  data-science, classification, reinforcement-learning, regression
Machine Learning With Python
Practice and tutorial-style notebooks covering wide variety of machine learning techniques
Stars: ✭ 2,197 (+1200%)
Mutual labels:  data-science, statistics, classification, regression
Mlinterview
A curated awesome list of AI Startups in India & Machine Learning Interview Guide. Feel free to contribute!
Stars: ✭ 410 (+142.6%)
Mutual labels:  data-science, statistics, natural-language-processing, reinforcement-learning
Courses
Quiz & Assignment of Coursera
Stars: ✭ 454 (+168.64%)
Mutual labels:  data-science, natural-language-processing, reinforcement-learning, data-visualization
Php Ml
PHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+4574.56%)
Mutual labels:  data-science, classification, data-mining, regression

Introduction

Welcome! The purpose of this repository is to serve as a stockpile of statistical methods, modeling techniques, and data science tools. The content includes everything from educational vignettes on specific topics, to tailored functions and modeling pipelines built to enhance and optimize analyses, to notes and code from various data science conferences, to general data science utilities. This will remain a work in progress, and I welcome all contributions and constructive criticism. If you have a suggestion or request, please use the "Issues" tab and I will endeavor to respond expeditiously!

Note: GitHub often has trouble rendering larger .ipynb files. If you are unable to view one of the Jupyter notebooks linked below, I recommend copying and pasting the notebook's link into Jupyter's nbviewer, which will take you to a viewable link like this one here for my "Visualization with Plotly" notebook. To ensure that you are viewing the most up-to-date version of the notebook with nbviewer, add ?flush_cache=true to the end of the generated URL, as is described here; otherwise, your link risks being slightly out of date.
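
For anyone who would rather build the nbviewer link programmatically, here is a minimal sketch of the URL rewrite described in the note above. It assumes the public nbviewer instance at nbviewer.org and the usual `/github/<user>/<repo>/blob/<branch>/<path>` route; the helper name `nbviewer_url` and the example repository path are hypothetical and only for illustration.

```python
from urllib.parse import urlencode, urlparse

# Assumption: the public nbviewer instance; swap in your own host if needed.
NBVIEWER_BASE = "https://nbviewer.org"


def nbviewer_url(github_url: str, flush_cache: bool = True) -> str:
    """Turn a github.com notebook URL into an nbviewer URL (hypothetical helper)."""
    parsed = urlparse(github_url)
    if parsed.netloc not in ("github.com", "www.github.com"):
        raise ValueError(f"not a github.com URL: {github_url}")

    # GitHub file URLs look like /<user>/<repo>/blob/<branch>/<path>.ipynb
    path = parsed.path
    if "/blob/" not in path or not path.endswith(".ipynb"):
        raise ValueError(f"not a link to a notebook file: {github_url}")

    url = f"{NBVIEWER_BASE}/github{path}"
    if flush_cache:
        # Ask nbviewer to re-fetch the notebook instead of serving a cached copy.
        url += "?" + urlencode({"flush_cache": "true"})
    return url


if __name__ == "__main__":
    # Placeholder URL, not a real notebook in this repository.
    example = "https://github.com/some-user/some-repo/blob/master/notebooks/example.ipynb"
    print(nbviewer_url(example))
```

Passing `flush_cache=False` returns the plain nbviewer link, which is fine when the notebook has not changed recently.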

Table of Contents

  1. Playground and Basics
    1. Rough Notes from ISLR Exercises -- R
    2. Rough Notes from Python Data Scientist Track -- Python
  2. Exploratory Data Analysis (EDA) and Visualization
    1. Practical Data Visualization with Python (Full Course) -- Python
    2. EDA and Basic Viz. -- R
    3. Visualizing Geographic Data -- Python
    4. Radar Charts -- Python
  3. Hypothesis Testing
    1. Kolmogorov-Smirnov Test (KS Test) -- R
    2. Useful Hypothesis Testing Functions -- R
  4. Classification
    1. Logistic Regression (Ridge and Lasso Methods Included) -- R
    2. Useful Classification Functions -- R
    3. Basic Tree Models -- R
    4. KNN -- R
  5. Regression
    1. Linear Regression -- Python
  6. Reinforcement Learning
  7. Text Mining and Natural Language Processing (NLP)
    1. Basic Text Mining and NLP -- R
  8. Time Series
    1. Time Series Forecasting with Facebook's Prophet Package -- Python
  9. Notes and Material from Data Science Conferences
    1. PyData 2018 DC Conference (Notes and Tutorial Code) -- Python
    2. Max Kuhn / RStudio Supervised Learning 2019 DC Conference -- R
    3. PyCon 2019 Conference (Notes and Session Code) -- Python
  10. Utilities
    1. HTML File Appender (Using Beautiful Soup) -- Python

Contribution Info

All are welcome and encouraged to contribute to this repository. My only requests are that you include a detailed description of your contribution, that your code be thoroughly commented, and that you test your contribution locally, with the most recent version of the master branch integrated, before submitting the PR.