ShopRunner / Octopod

Licence: bsd-3-clause
Train multi-task image, text, or ensemble (image + text) models

Octopod

Octopod is a general-purpose deep learning library developed by the ShopRunner Data Science team to train multi-task image, text, or ensemble (image + text) models.

What differentiates our library is that you can train a multi-task model with different datasets for each of your tasks. For example, you could train one model to label dress length for dresses and pants length for pants.
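As a rough illustration of the idea above, the sketch below mixes batches from two task-specific datasets into a single training schedule, so that one model could see dress examples for the dress-length task and pants examples for the pants-length task. This is not Octopod's API: the function names, task names, and file names are hypothetical, and it only assumes that each task brings its own list of examples.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Split one task's examples into consecutive batches."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def interleave_task_batches(
    task_datasets: Dict[str, Sequence[str]],
    batch_size: int,
    seed: int = 0,
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (task_name, batch) pairs, mixing batches from every task's dataset.

    Each batch contains examples from exactly one task, so a training loop
    could route it to that task's output head (hypothetical design).
    """
    rng = random.Random(seed)
    schedule = [
        (task, batch)
        for task, items in task_datasets.items()
        for batch in batches(items, batch_size)
    ]
    rng.shuffle(schedule)
    yield from schedule


# Hypothetical data: each task has its own, differently sized dataset.
task_datasets = {
    "dress_length": ["dress_001.jpg", "dress_002.jpg", "dress_003.jpg"],
    "pants_length": ["pants_001.jpg", "pants_002.jpg"],
}

schedule = list(interleave_task_batches(task_datasets, batch_size=2))
for task, batch in schedule:
    print(task, batch)
```

Because every batch is tagged with its task name, the datasets do not need to share examples or labels; only the model's shared layers see all of them.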

See the docs for more details.

To quickly get started, check out one of our tutorials in the notebooks folder. In particular, the synthetic_data tutorial provides a very quick example of how the code works.

Note 7/08/20: We are renaming this repository Octopod (previously called Tonks). The last version of the PyPI library under the name Tonks will not break but will warn the user to begin installing and using Octopod instead. No further development will continue under the name Tonks.

Note 6/12/20: Our team previously had a tradition of naming projects with terms or characters from the Harry Potter series, but we are disappointed by J.K. Rowling’s persistent transphobic comments. In response, we will be renaming this repository, and are working to develop an inclusive solution that minimizes disruption to our users.

Structure

  • notebooks
    • fashion_data: a set of notebooks demonstrating training Octopod models on an open source fashion dataset consisting of images and text descriptions
    • synthetic_data: a set of notebooks demonstrating training Octopod models on a set of generated color swatches. This is meant to be a quick, easy demo of the library's capabilities that can be run on CPUs.
  • octopod
    • ensemble: code for ensemble models of text and vision models
    • text: code for text models with a BERT architecture
    • vision: code for vision models with ResNet50 architectures

Installation

pip install octopod

You may get an error from the tokenizers package if you do not have a Rust compiler installed; see https://github.com/huggingface/transformers/issues/2831#issuecomment-592724471.

Notes

Currently, this library supports ResNet50 and BERT models.

In some of our documentation the terms pretrained and vanilla appear. pretrained is our shorthand for Octopod models that have been trained at least once already so their weights have been tuned for a specific use case. vanilla is our shorthand for base weights coming from transformers or PyTorch for the out-of-the-box BERT and ResNet50 models.
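To make the vanilla case concrete, here is a minimal sketch of loading out-of-the-box weights directly from transformers and torchvision. It does not use Octopod's own model classes. The checkpoint name `bert-base-uncased` and the `weights="DEFAULT"` choice are assumptions made for illustration, and the helper function names are hypothetical. The `weights=` argument requires a reasonably recent torchvision release; older releases used `pretrained=True` instead.

```python
def load_vanilla_bert(checkpoint: str = "bert-base-uncased"):
    """Load base BERT weights from transformers (checkpoint name is an assumption)."""
    # Imported lazily so this file can be read and imported without transformers installed.
    from transformers import BertModel

    # Downloads the checkpoint on first use, then reads it from the local cache.
    return BertModel.from_pretrained(checkpoint)


def load_vanilla_resnet50():
    """Load a ResNet50 with torchvision's default pretrained weights."""
    # Imported lazily so this file can be read and imported without torchvision installed.
    from torchvision import models

    # Assumes a torchvision release that accepts `weights=`;
    # older releases used `pretrained=True`.
    return models.resnet50(weights="DEFAULT")
```

A pretrained model, in the sense used above, would instead start from weights saved after at least one round of Octopod training.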

For our examples using text models, we use the transformers library maintained by Hugging Face; transformers is the name of the library's most recent version. The Hugging Face repo is the appropriate place to check BERT documentation and procedures.

Development

Want to add to or fix issues in Octopod? We welcome outside input and have tried to make it easier to test. You can run everything inside a docker container with the following:

# build the container
# NOTE: this may take a while
docker build -t octopod .
# nvidia-docker run : starts the container with NVIDIA Docker so it can access the GPU
# -it : runs interactively with a terminal attached
# --rm : deletes the container when it exits
# -v : mounts the current directory at /octopod inside the container
# -p : publishes ports (e.g., 8888 so Jupyter Notebook is reachable from the host)
# "pip install jupyter && bash" : installs Jupyter, then opens bash in the container
nvidia-docker run \
    -it \
    --rm \
    -v "${PWD}:/octopod" \
    -p 8888:8888 \
    octopod /bin/bash -c "pip install jupyter && bash"
# inside the container: run Jupyter Notebook
# NOTE: the empty token and password disable authentication; use only on a trusted machine
jupyter notebook --ip 0.0.0.0 --no-browser --allow-root --NotebookApp.token='' --NotebookApp.password=''