leriomaggio / pytorch-beautiful-ml-data

Licence: GPL-3.0 (LICENSE.txt), CC-BY-SA-4.0 (LICENSE-CC-BY-SA)
PyData Global Tutorial on Data Patterns and OOP abstractions for Deep Learning using PyTorch

Projects that are alternatives of or similar to pytorch-beautiful-ml-data

Pytorch1.0 Cn
Chinese edition of the official PyTorch 1.0 documentation. Follow the WeChat official account: 磐创AI
Stars: ✭ 215 (+1553.85%)
Mutual labels:  pytorch-tutorial
deep-blueberry
If you've always wanted to learn about deep-learning but don't know where to start, then you might have stumbled upon the right place!
Stars: ✭ 17 (+30.77%)
Mutual labels:  pytorch-tutorial
Awesome-Pytorch-Tutorials
Awesome Pytorch Tutorials
Stars: ✭ 23 (+76.92%)
Mutual labels:  pytorch-tutorial
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+24584.62%)
Mutual labels:  pytorch-tutorial
Text-Classification-LSTMs-PyTorch
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model coded in PyTorch. To provide a better understanding of the model, it uses a Tweets dataset provided by Kaggle.
Stars: ✭ 45 (+246.15%)
Mutual labels:  pytorch-tutorial
Receptive-Field-in-Pytorch
Numerical Computation of Receptive Field in Pytorch
Stars: ✭ 57 (+338.46%)
Mutual labels:  pytorch-tutorial
A Pytorch Tutorial To Text Classification
Hierarchical Attention Networks | a PyTorch Tutorial to Text Classification
Stars: ✭ 184 (+1315.38%)
Mutual labels:  pytorch-tutorial
PyTorchStepByStep
Official repository of my book: "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide"
Stars: ✭ 314 (+2315.38%)
Mutual labels:  pytorch-tutorial
Duke-NLP-WS-2020
Duke Natural Language Processing Winter School 2020
Stars: ✭ 22 (+69.23%)
Mutual labels:  pytorch-tutorial
Tutorial-Competition-2018
PyTorch KR Tutorial Competition 2018
Stars: ✭ 60 (+361.54%)
Mutual labels:  pytorch-tutorial
Pytorch Seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Stars: ✭ 3,418 (+26192.31%)
Mutual labels:  pytorch-tutorial
mrnet
Building an ACL tear detector to spot knee injuries from MRIs with PyTorch (MRNet)
Stars: ✭ 98 (+653.85%)
Mutual labels:  pytorch-tutorial
Pytorch-conditional-GANs
Implementation of Conditional Generative Adversarial Networks in PyTorch
Stars: ✭ 91 (+600%)
Mutual labels:  pytorch-tutorial
Book deeplearning in pytorch source
Stars: ✭ 236 (+1715.38%)
Mutual labels:  pytorch-tutorial
Custom-CNN-based-Image-Classification-in-PyTorch
No description or website provided.
Stars: ✭ 41 (+215.38%)
Mutual labels:  pytorch-tutorial
Pytorch Beginner
pytorch tutorial for beginners
Stars: ✭ 2,603 (+19923.08%)
Mutual labels:  pytorch-tutorial
deep-dream-pytorch
Pytorch implementation of DeepDream on VGG16 Network
Stars: ✭ 46 (+253.85%)
Mutual labels:  pytorch-tutorial
ATA-GAN
Demo code for Attention-Aware Generative Adversarial Networks paper
Stars: ✭ 13 (+0%)
Mutual labels:  pytorch-tutorial
pytorch-examples-cn
Learning PyTorch 1.0 by example (Chinese translation of, and study notes on, Learning PyTorch with Examples)
Stars: ✭ 54 (+315.38%)
Mutual labels:  pytorch-tutorial
intro-computervision
Notebooks for learning about the layers of a convolutional neural network.
Stars: ✭ 44 (+238.46%)
Mutual labels:  pytorch-tutorial

Beautiful Data for Machine Learning

Patterns & Best Practice for effective Data solutions with PyTorch

This tutorial will be presented at the PyData Global 2020 conference.

Abstract

Data is essential in Machine Learning, and PyTorch offers a very Pythonic solution to load complex and heterogeneous datasets. However, data loading is merely the first step: preprocessing, batching, sampling, partitioning, and augmenting the data usually follow.

This tutorial explores the internals of torch.utils.data, and describes patterns and best practices for elegant data solutions in Machine and Deep learning with PyTorch.

Get started

If you want to start digging into examples and patterns, there is a Cover notebook to get you started.

Outline

  1. Part 1 (Prelude)

    • Data Representation for Machine Learning
  2. Part 2 Intro to Dataset and DataLoader

    • torch.utils.data.Dataset at a glance
    • Case Study: FER Dataset
  3. Part 3 Data Transformation and Sampling

    • torchvision transforms
    • Case Study: Custom (Random) transforms
      • Transform pipelines with torchvision.transforms.Compose
    • Data Sampling and Data Loader
      • Handling imbalanced samples in FER data
  4. Part 4 Data Partitioning (training / validation / test): the PyTorch way

    • One Dataset is One Dataset
    • Subset and random_split
    • Case Study: Dataset and Cross-Validation
      • How to combine torch.utils.data.Dataset and sklearn.model_selection.KFold (without using skorch)
      • Combining Data Partitioning and Transforms
  5. Part 5 Data Abstractions for Image Segmentation

    • dataclass and Python Data Model
    • Case Study for Digital Pathology
    • Working with tiles and Patches
      • Patches in Batches for Spleen Segmentation
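Part 5 of the outline pairs dataclass with the Python Data Model for working with tiles and patches. The sketch below is one possible reading of that idea, not the tutorial's own code: Patch and TiledImage are hypothetical names, and the image is represented only by its height and width. Implementing __len__ and __getitem__ is enough to make the container indexable and iterable, which is the same protocol a map-style PyTorch dataset relies on.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Patch:
    """Hypothetical square patch, identified by its top-left corner."""

    row: int
    col: int
    size: int

    @property
    def bounds(self) -> Tuple[int, int, int, int]:
        # (top, left, bottom, right), with bottom/right exclusive
        return (self.row, self.col, self.row + self.size, self.col + self.size)


@dataclass
class TiledImage:
    """Hypothetical grid of non-overlapping patches over an image."""

    height: int
    width: int
    patch_size: int

    def __len__(self) -> int:
        # Only full patches are counted; partial border patches are dropped.
        return (self.height // self.patch_size) * (self.width // self.patch_size)

    def __getitem__(self, index: int) -> Patch:
        if not 0 <= index < len(self):
            raise IndexError(index)
        patches_per_row = self.width // self.patch_size
        grid_row, grid_col = divmod(index, patches_per_row)
        return Patch(grid_row * self.patch_size, grid_col * self.patch_size, self.patch_size)


tiles = TiledImage(height=4, width=6, patch_size=2)
all_patches = list(tiles)  # iteration works through __getitem__ and IndexError
```

Because the patches are frozen dataclasses, they are hashable and compare by value, so they can be used as dictionary keys or collected in sets without extra code.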

Description

Data processing is at the heart of every Machine Learning (ML) model training and evaluation loop, and PyTorch has revolutionised the way in which data is managed. The very Pythonic Dataset and DataLoader classes replace (nested) lists of NumPy ndarrays.
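As a minimal sketch of that substitution, the example below wraps two NumPy arrays in a map-style Dataset and lets DataLoader do the batching. ArrayDataset is a hypothetical name and the data is random; it is meant as an illustration rather than the code used in the notebooks.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ArrayDataset(Dataset):
    """Hypothetical map-style dataset wrapping two NumPy arrays."""

    def __init__(self, features: np.ndarray, labels: np.ndarray):
        if len(features) != len(labels):
            raise ValueError("features and labels must have the same length")
        self.features = torch.from_numpy(features).float()
        self.labels = torch.from_numpy(labels).long()

    def __len__(self) -> int:
        return len(self.labels)

    def __getitem__(self, index: int):
        # One (sample, target) pair; DataLoader collates pairs into batches.
        return self.features[index], self.labels[index]


rng = np.random.default_rng(0)
dataset = ArrayDataset(rng.normal(size=(10, 4)), rng.integers(0, 2, size=10))
loader = DataLoader(dataset, batch_size=4, shuffle=False)

# 10 samples with batch_size=4 give batches of 4, 4 and 2 samples.
batch_shapes = [tuple(x.shape) for x, _ in loader]
```

The dataset only says how to fetch one item; batching, shuffling and parallel loading stay in the DataLoader.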

However, data loading is merely the first step. Data preprocessing, sampling, batching, and partitioning are fundamental operations that are usually required in a complete ML pipeline.
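The following sketch shows how three of these operations can be expressed with torch.utils.data alone: random_split for partitioning, WeightedRandomSampler for sampling, and DataLoader for batching. The toy data, the 80/20 split and the inverse-frequency weighting are assumptions made for illustration, not choices taken from the tutorial.

```python
import torch
from torch.utils.data import (
    DataLoader,
    TensorDataset,
    WeightedRandomSampler,
    random_split,
)

torch.manual_seed(0)

# Toy imbalanced data: 90 samples of class 0 and 10 samples of class 1.
features = torch.randn(100, 3)
labels = torch.cat(
    [torch.zeros(90, dtype=torch.long), torch.ones(10, dtype=torch.long)]
)
dataset = TensorDataset(features, labels)

# Partitioning: an 80/20 train/validation split with a fixed generator.
train_set, valid_set = random_split(
    dataset, [80, 20], generator=torch.Generator().manual_seed(0)
)

# Sampling: weight each training sample by the inverse frequency of its class.
train_labels = labels[train_set.indices]
class_counts = torch.bincount(train_labels, minlength=2).float().clamp(min=1)
sample_weights = (1.0 / class_counts)[train_labels]
sampler = WeightedRandomSampler(
    sample_weights, num_samples=len(train_set), replacement=True
)

# Batching: the sampler decides which indices end up in each batch.
train_loader = DataLoader(train_set, batch_size=16, sampler=sampler)
valid_loader = DataLoader(valid_set, batch_size=16, shuffle=False)
```

Only the training loader uses the weighted sampler; the validation loader keeps the original class distribution so that evaluation reflects the real data.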

If not properly managed, these operations can ultimately lead to lots of boilerplate code and to re-inventing the wheel™. This tutorial will dig into the internals of torch.utils.data to present patterns and best practices to load heterogeneous and custom datasets in the most elegant and Pythonic way.
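One pattern that keeps such boilerplate down is a thin wrapper that applies a transform on top of an existing dataset or split, so that training and validation partitions can use different preprocessing. The sketch below assumes two hypothetical helpers, RandomFlip and TransformedDataset, and uses only torch; it is one way to express the idea, not necessarily the one presented in the notebooks.

```python
import torch
from torch.utils.data import Dataset, TensorDataset, random_split


class RandomFlip:
    """Hypothetical custom transform: flips the last dimension with probability p."""

    def __init__(self, p: float = 0.5):
        self.p = p

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        if torch.rand(1).item() < self.p:
            return torch.flip(x, dims=[-1])
        return x


class TransformedDataset(Dataset):
    """Hypothetical wrapper applying a transform lazily to another dataset."""

    def __init__(self, base, transform=None):
        self.base = base
        self.transform = transform

    def __len__(self) -> int:
        return len(self.base)

    def __getitem__(self, index: int):
        x, y = self.base[index]
        if self.transform is not None:
            x = self.transform(x)
        return x, y


# Toy "images": 20 single-channel 2x2 tensors with alternating labels.
images = torch.arange(20 * 4, dtype=torch.float32).reshape(20, 1, 2, 2)
targets = torch.arange(20) % 2
full = TensorDataset(images, targets)

train_part, valid_part = random_split(
    full, [16, 4], generator=torch.Generator().manual_seed(0)
)

# Augment only the training partition; validation data stays untouched.
train_set = TransformedDataset(train_part, transform=RandomFlip(p=1.0))
valid_set = TransformedDataset(valid_part)
```

Setting p=1.0 makes the flip deterministic here, which keeps the example easy to check; a real augmentation would use a probability below one.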

The tutorial is organised in four parts, each focusing on specific patterns for ML data and scenarios. These parts will share the same internal structure: (I) general introduction; (II) case study.

The first section will provide a technical introduction to the problem and a description of the torch internals. Case studies are then used to deliver concrete examples and applications, to engage with the audience, and to foster discussion. Off-the-shelf and/or custom heterogeneous datasets will be used to cover the broadest possible range of audience interests (e.g. Images, Text, Mixed-Type Datasets).

Pre-requisites

Familiarity with the basic concepts of Machine/Deep Learning data processing is required to attend this tutorial. Proficiency with the Python language and the Python Object Model is also required. Basic knowledge of the main PyTorch features is preferable.

Setting up the Python Environment

The Python virtual environment needed to run all the notebooks in this repository can be created using either conda (for the Anaconda Python distribution) or pyenv and pip.

To set up the Anaconda environment:

$ conda env create -f torch_beautiful_data.yml

This will create a new virtual environment called torch-beautiful-data. To activate it, run:

$ conda activate torch-beautiful-data

At this stage, you're all set and ready to start playing with the notebooks. Start a Jupyter notebook server on your local computer by running the following command in your terminal:

$ jupyter notebook

Have fun! 🎉

Note: Alternatively, if you prefer to install the required packages using pip, run the following command:

$ pip install -r requirements.txt

Acknowledgments

Public shout out to all PyData Global organisers, and to Matthijs in particular for his wonderful support during the preparation of this tutorial!