ViCCo-Group / THINGSvision

License: MIT
Python package for extracting and analyzing image representations from state-of-the-art neural networks for computer vision

Programming Languages

  • Python
  • Jupyter Notebook
  • Shell

Projects that are alternatives to or similar to THINGSvision

Vertical Rhythm
Put some typographical vertical rhythm in your CSS. LESS, Stylus and SCSS/SASS versions included.
Stars: ✭ 83 (-21.7%)
Mutual labels:  alignment
Subaligner
Automatically synchronize subtitles to audiovisual content with a pretrained deep neural network and forced alignments. https://subaligner.readthedocs.io/
Stars: ✭ 181 (+70.75%)
Mutual labels:  alignment
Rmsd
Calculate Root-mean-square deviation (RMSD) of two molecules, using rotation, in xyz or pdb format
Stars: ✭ 215 (+102.83%)
Mutual labels:  alignment
Sortmerna
SortMeRNA: next-generation sequence filtering and alignment tool
Stars: ✭ 108 (+1.89%)
Mutual labels:  alignment
Vcf2phylip
Convert SNPs in VCF format to PHYLIP, NEXUS, binary NEXUS, or FASTA alignments for phylogenetic analysis
Stars: ✭ 126 (+18.87%)
Mutual labels:  alignment
Smartsystemmenu
SmartSystemMenu extends system menu of all windows in the system
Stars: ✭ 209 (+97.17%)
Mutual labels:  alignment
Sibeliaz
A fast whole-genome aligner based on de Bruijn graphs
Stars: ✭ 76 (-28.3%)
Mutual labels:  alignment
Scan2cad
[CVPR'19] Dataset and code used in the research project Scan2CAD: Learning CAD Model Alignment in RGB-D Scans
Stars: ✭ 249 (+134.91%)
Mutual labels:  alignment
Aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Stars: ✭ 1,942 (+1732.08%)
Mutual labels:  alignment
Ngmlr
NGMLR is a long-read mapper designed to align PacBio or Oxford Nanopore (standard and ultra-long) to a reference genome with a focus on reads that span structural variations
Stars: ✭ 215 (+102.83%)
Mutual labels:  alignment
Stfan
Code repo for "Spatio-Temporal Filter Adaptive Network for Video Deblurring" (ICCV'19)
Stars: ✭ 110 (+3.77%)
Mutual labels:  alignment
3ddfa v2
The official PyTorch implementation of Towards Fast, Accurate and Stable 3D Dense Face Alignment, ECCV 2020.
Stars: ✭ 1,961 (+1750%)
Mutual labels:  alignment
Genomeworks
SDK for GPU accelerated genome assembly and analysis
Stars: ✭ 215 (+102.83%)
Mutual labels:  alignment
Awesome Image Alignment And Stitching
A curated list of awesome resources for image alignment and stitching ...
Stars: ✭ 101 (-4.72%)
Mutual labels:  alignment
Pedestrian alignment
TCSVT2018 Pedestrian Alignment Network for Large-scale Person Re-identification
Stars: ✭ 223 (+110.38%)
Mutual labels:  alignment
Beautifyr
RStudio addin for formatting Rmarkdown tables
Stars: ✭ 77 (-27.36%)
Mutual labels:  alignment
Mtcnn Accelerate Onet
MTCNN Face Detection & Alignment
Stars: ✭ 203 (+91.51%)
Mutual labels:  alignment
indigo
Indigo: SNV and InDel Discovery in Chromatogram traces obtained from Sanger sequencing of PCR products
Stars: ✭ 26 (-75.47%)
Mutual labels:  alignment
Hh Suite
Remote protein homology detection suite.
Stars: ✭ 230 (+116.98%)
Mutual labels:  alignment
Core Layout
Flexbox & CSS-style Layout in Swift.
Stars: ✭ 215 (+102.83%)
Mutual labels:  alignment


📔 Table of Contents

  • 🌟 About the Project
  • 🦾 Functionality
  • 🗄️ Model collection
  • 🏃 Getting Started
  • 🔍 Basic usage
  • 👋 How to contribute
  • ⚠️ License
  • 📃 Citation
  • 💎 Contributions

🌟 About the Project

thingsvision is a Python package that lets you easily extract image representations from many state-of-the-art neural networks for computer vision. In a nutshell, you feed thingsvision a directory of images and tell it which neural network you are interested in. thingsvision then gives you that network's representation of each image, so you end up with one feature vector per image. You can use these feature vectors for further analyses. For brevity, we use the word features whenever we mean "image representation".

🚨 Note: some function calls mentioned in the paper have been deprecated. To use this package successfully, exclusively follow this README and the documentation. 🚨

(back to top)

🦾 Functionality

With thingsvision, you can:

  • extract features for any image set from many popular networks.
  • extract features for any image set from your custom networks.
  • extract features for >26,000 images from the THINGS image database.
  • optionally turn off the standard center cropping performed by many networks before extracting features.
  • extract features from HDF5 datasets directly (e.g., NSD stimuli).
  • conduct basic Representational Similarity Analysis (RSA) after feature extraction (see the sketch after this list).
  • perform Centered Kernel Alignment (CKA) to compare image features across model-module combinations (a CKA sketch appears in the Basic usage section).
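
For example, a basic RSA workflow on extracted features could look like the following. This is a minimal sketch rather than the definitive API: it uses the compute_rdm and correlate_rdms helpers from thingsvision.core.rsa as described in the documentation, and behavioral_rdm is a hypothetical reference RDM (e.g., from behavioral similarity judgments) that you would supply yourself.

from thingsvision.core.rsa import compute_rdm, correlate_rdms

# build a representational dissimilarity matrix (RDM) from the
# (n_images x n_features) array returned by the extractor
rdm_dnn = compute_rdm(features, method='correlation')

# correlate the model RDM with a reference RDM;
# `behavioral_rdm` is a hypothetical array you provide yourself
rho = correlate_rdms(rdm_dnn, behavioral_rdm, correlation='pearson')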

(back to top)

🗄️ Model collection

Neural networks come from different sources. With thingsvision, you can extract image representations of all models from the sources below (a short loading sketch follows the list):

  • torchvision
  • Keras
  • timm
  • ssl (Self-Supervised Learning Models)
    • simclr-rn50, mocov2-rn50, jigsaw-rn50, rotnet-rn50, swav-rn50, pirl-rn50 (retrieved from vissl)
    • barlowtwins-rn50, vicreg-rn50, dino-vit{s/b}{8/16}, dino-xcit-{small/medium}-{12/24}-p{8/16}, dino-rn50 (retrieved from torch.hub)
  • OpenCLIP
  • both original CLIP variants (ViT-B/32 and RN50)
  • a few custom models (Alexnet, VGG-16, Resnet50, and Inception_v3) trained on Ecoset rather than ImageNet, and one Alexnet pretrained on ImageNet and fine-tuned on SalObjSub
  • each of the many CORnet versions
  • Harmonization models from the official repo. The default variant is ViT_B16. However, the following encoders are additionally available: ResNet50, VGG16, EfficientNetB0, tiny_ConvNeXT, tiny_MaxViT, LeViT_small
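
Regardless of the source, models are loaded through the same get_extractor call used throughout this README; only model_name and source change. Below is a minimal sketch for a timm model, assuming the model name vit_base_patch16_224 is available in your installed timm version.

import torch
from thingsvision import get_extractor

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# load a Vision Transformer from timm instead of torchvision;
# the model name is an assumption and depends on your timm version
extractor = get_extractor(
    model_name='vit_base_patch16_224',
    source='timm',
    device=device,
    pretrained=True
)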

(back to top)

🏃 Getting Started

💻 Setting up your environment

Working locally.

First, create a new conda environment with Python version 3.8, 3.9, or 3.10:

$ conda create -n thingsvision python=3.9
$ conda activate thingsvision

Then, with the environment activated, install thingsvision by running the following pip commands in your terminal.

$ pip install --upgrade thingsvision
$ pip install git+https://github.com/openai/CLIP.git

If you want to extract features for harmonized models from the Harmonization repo, you additionally have to run the following pip commands in your thingsvision environment (note: as of now, this seems to work smoothly on Ubuntu only, not on macOS):

$ pip install git+https://github.com/serre-lab/Harmonization.git
$ pip install "keras-cv-attention-models>=1.3.5"

Google Colab.

Alternatively, you can use Google Colab to play around with thingsvision by uploading your image data to Google Drive (via directory mounting). You can find the Jupyter notebook using PyTorch here and the TensorFlow example here.

(back to top)

🔍 Basic usage

Command Line Interface (CLI)

thingsvision was designed to simplify feature extraction. If you have a folder of images (e.g., ./images) and want to extract features for each of these images without opening a Jupyter notebook or writing a Python script, it's probably easiest to use our CLI. The interface includes two commands:

  • thingsvision show-model
  • thingsvision extract-features

Example calls might look as follows:

thingsvision show-model --model-name "alexnet" --source "torchvision"
thingsvision extract-features --image-root "./data" --model-name "alexnet" --module-name "features.10" --batch-size 32 --device "cuda" --source "torchvision" --file-format "npy" --out-path "./features"

See thingsvision show-model -h and thingsvision extract-features -h for a list of all possible arguments. Note that the CLI provides only the basic extraction functionality, but this is probably enough for most users who don't want to dive too deep into the various models and modules. If you need more fine-grained control over the extraction itself, we recommend using the Python package directly and writing your own Python script.

Python commands

To do so, start by importing all the necessary components and instantiating a thingsvision extractor. Here, we use AlexNet from the torchvision library as the model to extract features from, and we also move the model to the GPU (if available) for faster inference:

import torch
from thingsvision import get_extractor
from thingsvision.utils.storing import save_features
from thingsvision.utils.data import ImageDataset, DataLoader

model_name = 'alexnet'
source = 'torchvision'
device = 'cuda' if torch.cuda.is_available() else 'cpu'

extractor = get_extractor(
    model_name=model_name,
    source=source,
    device=device,
    pretrained=True
)

As a next step, create both a dataset and a dataloader for your images. We assume that all of your images live in a single root directory, which can contain subfolders (e.g., for individual classes). We therefore leverage the ImageDataset class.

root = 'path/to/root/img/directory' # e.g., './images/'
batch_size = 32

dataset = ImageDataset(
    root=root,
    out_path='path/to/features',
    backend=extractor.get_backend(), # backend framework of model
    transforms=extractor.get_transformations(resize_dim=256, crop_dim=224) # set input dimensionality to whatever is required for your pretrained model
)

batches = DataLoader(
    dataset=dataset,
    batch_size=batch_size,
    backend=extractor.get_backend() # backend framework of model
)

Now all that is left is to extract the image features and store them to disk! Here we're extracting features from the last convolutional layer of AlexNet (features.10), but if you don't know which modules are available for a given model, just call extractor.show_model() to print all modules.

module_name = 'features.10'

features = extractor.extract_features(
    batches=batches,
    module_name=module_name,
    flatten_acts=True, # flatten 2D feature maps from convolutional layer
    output_type="ndarray", # or "tensor" (only applicable to PyTorch models)
)

save_features(features, out_path='path/to/features', file_format='npy') # file_format can be set to "npy", "txt", "mat", "pt", or "hdf5"
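
Once stored, the features can be loaded back into memory for downstream analyses. A minimal sketch for the npy case, assuming save_features writes a file named features.npy into out_path (check your out_path for the exact file name):

import numpy as np

# load the saved (n_images x n_features) array;
# the file name is an assumption about save_features' default
features = np.load('path/to/features/features.npy')
print(features.shape)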

For more examples on the many models available in thingsvision and explanations of additional functionality like how to optionally turn off center cropping, how to use HDF5 datasets (e.g. NSD stimuli), how to perform RSA or CKA, or how to easily extract features for the THINGS image database, please refer to the Documentation.
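
As a quick taste of the comparison functionality, a CKA comparison between two sets of features might look like the following. This is a minimal sketch, assuming the get_cka helper from thingsvision.core.cka behaves as described in the documentation; features_i and features_j are hypothetical (n_images x n_features) arrays extracted from two model-module combinations over the same images in the same order.

from thingsvision.core.cka import get_cka

m = features_i.shape[0] # number of images; must match across both feature sets

# instantiate a linear CKA object for m samples
cka = get_cka(backend='numpy', m=m, kernel='linear')

# similarity between the two representations; higher means more similar
rho = cka.compare(X=features_i, Y=features_j)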

(back to top)

👋 How to contribute

If you come across problems or have suggestions, please submit an issue!

(back to top)

⚠️ License

This GitHub repository is licensed under the MIT License - see the LICENSE.md file for details.

(back to top)

📃 Citation

If you use this GitHub repository (or any modules associated with it), please cite our paper for the initial version of thingsvision as follows:

@article{Muttenthaler_2021,
	author = {Muttenthaler, Lukas and Hebart, Martin N.},
	title = {THINGSvision: A Python Toolbox for Streamlining the Extraction of Activations From Deep Neural Networks},
	journal = {Frontiers in Neuroinformatics},
	volume = {15},
	pages = {45},
	year = {2021},
	url = {https://www.frontiersin.org/article/10.3389/fninf.2021.679838},
	doi = {10.3389/fninf.2021.679838},
	issn = {1662-5196},
}

(back to top)

💎 Contributions

This library is based on the groundwork laid by Lukas Muttenthaler and Martin N. Hebart, who are both still actively involved, but has been extended and refined into its current form with the help of our many contributors.

This is a joint open-source project between the Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, and the Machine Learning Group at Technische Universität Berlin. Correspondence and requests for contributing should be addressed to Lukas Muttenthaler. Feel free to contact us if you want to become a contributor or have any suggestions or feedback. For the latter, you can also just post an issue or engage in discussions. We'll try to respond as fast as we can.

(back to top)
