All Projects → lightly-ai → Lightly

lightly-ai / Lightly

Licence: mit
A python library for self-supervised learning on images.

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Lightly

Persian-Sentiment-Analyzer
Persian sentiment analysis ( آناکاوی سهش های فارسی | تحلیل احساسات فارسی )
Stars: ✭ 30 (-93.17%)
Mutual labels:  embeddings
Decagon
Graph convolutional neural network for multirelational link prediction
Stars: ✭ 268 (-38.95%)
Mutual labels:  embeddings
Nlp Cube
Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
Stars: ✭ 353 (-19.59%)
Mutual labels:  embeddings
game2vec
TensorFlow implementation of word2vec applied on https://www.kaggle.com/tamber/steam-video-games dataset, using both CBOW and Skip-gram.
Stars: ✭ 62 (-85.88%)
Mutual labels:  embeddings
Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training.
Stars: ✭ 78 (-82.23%)
Mutual labels:  embeddings
Paperlist For Recommender Systems
Recommender Systems Paperlist that I am interested in
Stars: ✭ 293 (-33.26%)
Mutual labels:  embeddings
entity-embed
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Stars: ✭ 96 (-78.13%)
Mutual labels:  embeddings
Multi Class Text Classification Cnn
Classify Kaggle Consumer Finance Complaints into 11 classes. Build the model with CNN (Convolutional Neural Network) and Word Embeddings on Tensorflow.
Stars: ✭ 410 (-6.61%)
Mutual labels:  embeddings
Hub
A library for transfer learning by reusing parts of TensorFlow models.
Stars: ✭ 3,007 (+584.97%)
Mutual labels:  embeddings
Contextualized Topic Models
A python package to run contextualized topic modeling. CTMs combine BERT with topic models to get coherent topics. Also supports multilingual tasks. Cross-lingual Zero-shot model published at EACL 2021.
Stars: ✭ 318 (-27.56%)
Mutual labels:  embeddings
cade
Compass-aligned Distributional Embeddings. Align embeddings from different corpora
Stars: ✭ 29 (-93.39%)
Mutual labels:  embeddings
keras-word-char-embd
Concatenate word and character embeddings in Keras
Stars: ✭ 43 (-90.21%)
Mutual labels:  embeddings
Cleora
Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.
Stars: ✭ 303 (-30.98%)
Mutual labels:  embeddings
watset-java
An implementation of the Watset clustering algorithm in Java.
Stars: ✭ 24 (-94.53%)
Mutual labels:  embeddings
Fast sentence embeddings
Compute Sentence Embeddings Fast!
Stars: ✭ 384 (-12.53%)
Mutual labels:  embeddings
FaceRecognition With FaceNet Android
Face Recognition using the FaceNet model and MLKit on Android.
Stars: ✭ 113 (-74.26%)
Mutual labels:  embeddings
Polyfuzz
Fuzzy string matching, grouping, and evaluation.
Stars: ✭ 292 (-33.49%)
Mutual labels:  embeddings
Nimfa
Nimfa: Nonnegative matrix factorization in Python
Stars: ✭ 440 (+0.23%)
Mutual labels:  embeddings
Lmdb Embeddings
Fast word vectors with little memory usage in Python
Stars: ✭ 404 (-7.97%)
Mutual labels:  embeddings
Vectorhub
Vector Hub - Library for easy discovery, and consumption of State-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)
Stars: ✭ 317 (-27.79%)
Mutual labels:  embeddings

Lightly Logo

GitHub Unit Tests codecov

Lightly is a computer vision framework for self-supervised learning.

We, at Lightly, are passionate engineers who want to make deep learning more efficient. We want to help popularize the use of self-supervised methods to understand and filter raw image data. Our solution can be applied before any data annotation step and the learned representations can be used to analyze and visualize datasets as well as for selecting a core set of samples.

Supported Models

Tutorials

Want to jump to the tutorials and see lightly in action?

Benchmarks

Currently implemented models and their accuracy on cifar10. All models have been evaluated using kNN. We report the max test accuracy over the epochs as well as the maximum GPU memory consumption. All models in this benchmark use the same augmentations as well as the same ResNet-18 backbone. Training precision is set to FP32 and SGD is used as an optimizer with cosineLR. One epoch on cifar10 takes ~35 secondson a V100 GPU. Learn more about the cifar10 benchmark here

Model Epochs Batch Size Test Accuracy Peak GPU usage
MoCo 200 128 0.83 2.1 GBytes
SimCLR 200 128 0.78 2.0 GBytes
SimSiam 200 128 0.73 3.0 GBytes
MoCo 200 512 0.85 7.4 GBytes
SimCLR 200 512 0.83 7.8 GBytes
SimSiam 200 512 0.81 7.0 GBytes
MoCo 800 128 0.89 2.1 GBytes
SimCLR 800 128 0.87 1.9 GBytes
SimSiam 800 128 0.80 2.0 GBytes
MoCo 800 512 0.90 7.2 GBytes
SimCLR 800 512 0.89 7.7 GBytes
SimSiam 800 512 0.91 6.9 GBytes

Terminology

Below you can see a schematic overview of the different concepts present in the lightly Python package. The terms in bold are explained in more detail in our documentation.

Overview of the lightly pip package

Quick Start

Lightly requires Python 3.6+. We recommend installing Lightly in a Linux or OSX environment.

Requirements

  • hydra-core>=1.0.0
  • numpy>=1.18.1
  • pytorch_lightning>=1.0.4
  • requests>=2.23.0
  • torchvision
  • tqdm

Installation

You can install Lightly and its dependencies from PyPI with:

pip3 install lightly

We strongly recommend that you install Lightly in a dedicated virtualenv, to avoid conflicting with your system packages.

Command-Line Interface

Lightly is accessible also through a command-line interface (CLI). To train a SimCLR model on a folder of images you can simply run the following command:

lightly-train input_dir=/mydataset

To create an embedding of a dataset you can use:

lightly-embed input_dir=/mydataset checkpoint=/mycheckpoint

The embeddings with the corresponding filename are stored in a human-readable .csv file.

Next Steps

Head to the documentation and see the things you can achieve with Lightly!

Development

To install dev dependencies (for example to contribute to the framework) you can use the following command:

pip3 install -e ".[dev]"

For more information about how to contribute have a look here.

Running Tests

Unit tests are within the tests folder and we recommend to run them using pytest. There are two test configurations available. By default only a subset will be run. This is faster and should take less than a minute. You can run it using

python -m pytest -s -v

To run all tests (including the slow ones) you can use the following command.

python -m pytest -s -v --runslow

Code Linting

We provide a Pylint config following the Google Python Style Guide.

You can run the linter from your terminal either on a folder

pylint lightly/

or on a specific file

pylint lightly/core.py

Further Reading

Self-supervised Learning:

FAQ

  • Why should I care about self-supervised learning? Aren't pre-trained models from ImageNet much better for transfer learning?

    • Self-supervised learning has become increasingly popular among scientists over the last year because the learned representations perform extraordinarily well on downstream tasks. This means that they capture the important information in an image better than other types of pre-trained models. By training a self-supervised model on your dataset, you can make sure that the representations have all the necessary information about your images.
  • How can I contribute?

    • Create an issue if you encounter bugs or have ideas for features we should implement. You can also add your own code by forking this repository and creating a PR. More details about how to contribute with code is in our contribution guide.
  • Is this framework for free?

    • Yes, this framework completely free to use and we provide the code. We believe that we need to make training deep learning models more data efficient to achieve widespread adoption. One step to achieve this goal is by leveraging self-supervised learning. The company behind lightly commited to keep this framework open-source.
  • If this framework is free, how is the company behind lightly making money?

    • Training self-supervised models is only part of the solution. The company behind lightly focuses on processing and analyzing embeddings created by self-supervised models. By building, what we call a self-supervised active learning loop we help companies understand and work with their data more efficiently. This framework acts as an interface for our platform to easily upload and download datasets, embeddings and models. Whereas the platform will cost for additional features this frameworks will always remain free of charge (even for commercial use).
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].