
J-SNACKKB / FLIP

License: AFL-3.0
A collection of tasks to probe the effectiveness of protein sequence representations in modeling aspects of protein design

Programming Languages

Jupyter Notebook
11667 projects
Python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to FLIP

deepblast
Neural Networks for Protein Sequence Alignment
Stars: ✭ 29 (-17.14%)
Mutual labels:  protein, protein-sequences
gcWGAN
Guided Conditional Wasserstein GAN for De Novo Protein Design
Stars: ✭ 38 (+8.57%)
Mutual labels:  protein, protein-design
lightdock
Protein-protein, protein-peptide and protein-DNA docking framework based on the GSO algorithm
Stars: ✭ 110 (+214.29%)
Mutual labels:  protein, protein-design
tape-neurips2019
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. (DEPRECATED)
Stars: ✭ 117 (+234.29%)
Mutual labels:  protein-sequences
RamaNet
Performs de novo protein design using machine learning and PyRosetta to generate novel protein structures
Stars: ✭ 41 (+17.14%)
Mutual labels:  protein-design
reprieve
A library for evaluating representations.
Stars: ✭ 68 (+94.29%)
Mutual labels:  representation-learning
pair2vec
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Stars: ✭ 62 (+77.14%)
Mutual labels:  representation-learning
PCC-pytorch
A pytorch implementation of the paper "Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control"
Stars: ✭ 57 (+62.86%)
Mutual labels:  representation-learning
ParametricUMAP paper
Parametric UMAP embeddings for representation and semisupervised learning. From the paper "Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning" (Sainburg, McInnes, Gentner, 2020).
Stars: ✭ 132 (+277.14%)
Mutual labels:  representation-learning
REGAL
Representation learning-based graph alignment based on implicit matrix factorization and structural embeddings
Stars: ✭ 78 (+122.86%)
Mutual labels:  representation-learning
MTL-AQA
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment [CVPR 2019]
Stars: ✭ 38 (+8.57%)
Mutual labels:  representation-learning
cgdms
Differentiable molecular simulation of proteins with a coarse-grained potential
Stars: ✭ 44 (+25.71%)
Mutual labels:  protein
GLOM-TensorFlow
An attempt at the implementation of GLOM, Geoffrey Hinton's paper for emergent part-whole hierarchies from data
Stars: ✭ 32 (-8.57%)
Mutual labels:  representation-learning
Learning-From-Rules
Implementation of experiments in paper "Learning from Rules Generalizing Labeled Exemplars" to appear in ICLR2020 (https://openreview.net/forum?id=SkeuexBtDr)
Stars: ✭ 46 (+31.43%)
Mutual labels:  representation-learning
pia
📚 🔬 PIA - Protein Inference Algorithms
Stars: ✭ 19 (-45.71%)
Mutual labels:  protein
FUSION
PyTorch code for NeurIPSW 2020 paper (4th Workshop on Meta-Learning) "Few-Shot Unsupervised Continual Learning through Meta-Examples"
Stars: ✭ 18 (-48.57%)
Mutual labels:  representation-learning
TCE
This repository contains the code implementation used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representation Learning (TCE).
Stars: ✭ 51 (+45.71%)
Mutual labels:  representation-learning
causal-ml
Must-read papers and resources related to causal inference and machine (deep) learning
Stars: ✭ 387 (+1005.71%)
Mutual labels:  representation-learning
EVE
Official repository for the paper "Large-scale clinical interpretation of genetic variants using evolutionary data and deep learning". Joint collaboration between the Marks lab and the OATML group.
Stars: ✭ 37 (+5.71%)
Mutual labels:  protein
M-NMF
An implementation of "Community Preserving Network Embedding" (AAAI 2017)
Stars: ✭ 119 (+240%)
Mutual labels:  representation-learning

Bio-Benchmarks for Protein Engineering

This repository accompanies the paper submitted to the 2021 NeurIPS Benchmark track.

Folder breakdown

  1. collect_splits contains notebooks to process RAW datasets collected from various sources.
  2. splits contains all splits, a brief description of their processing, and the logic behind the train/test splits.
  3. baselines contains the code used to compute baselines.

A .gitignored folder called data contains the RAW data used to produce all splits. Because of its size, this folder cannot be hosted on GitHub; it can instead be downloaded from http://data.bioembeddings.com/public/FLIP
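Since the data folder must be fetched separately, a small helper like the one below can assemble download URLs under the public endpoint above. The relative file path used in the example is hypothetical; check the server listing for the actual file names.

```python
from urllib.parse import urljoin
import urllib.request

# Public endpoint stated in this README; file paths beneath it are assumptions.
FLIP_DATA_URL = "http://data.bioembeddings.com/public/FLIP/"

def raw_data_url(relative_path: str) -> str:
    """Build the full download URL for a file under the public data endpoint."""
    return urljoin(FLIP_DATA_URL, relative_path)

def download(relative_path: str, destination: str) -> None:
    """Fetch one RAW data file to a local path (requires network access)."""
    urllib.request.urlretrieve(raw_data_url(relative_path), destination)

# Hypothetical example path, for illustration only:
print(raw_data_url("some_dataset/raw_data.csv"))
```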

Find out more about the splits

The goal of the splits in this repository is to assess how well machine learning models operating on protein sequence inputs can capture different dimensions relevant to protein design. The main place to learn about the splits is the splits folder. Each dataset ships as a zip file containing one or more "splits", where each split is a different train/test partition motivated by biological or statistical intuition.
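As a sketch of how one such split file might be consumed, the snippet below partitions rows of a CSV into train and test sets. The column name "set" and its "train"/"test" values are assumptions about the file layout made for illustration, not something this README specifies.

```python
import csv
import io

def split_train_test(csv_text: str, set_column: str = "set"):
    """Partition rows of a split CSV into train and test lists.

    The column name 'set' with values 'train'/'test' is an assumed
    layout; adjust to match the actual files in the splits folder.
    """
    train, test = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        (train if row[set_column] == "train" else test).append(row)
    return train, test

# Tiny made-up example with the assumed columns:
example = "sequence,target,set\nMKV,1.2,train\nMKA,0.7,test\n"
train, test = split_train_test(example)
print(len(train), len(test))  # prints: 1 1
```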

Split semaphore

Each split is associated with a semaphore that indicates how it may be used:

  • 🟢: active splits that can be used to evaluate the accuracy of your machine learning models
  • 🟠: splits that should not be used for performance comparisons, either because they may overestimate performance or because other active splits have similar discriminative ability
  • 🔴: obsolete splits that should not be used. Please do not report performance on these.
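When iterating over many splits programmatically, the semaphore convention above can be encoded as a simple filter. The string statuses below ("green"/"orange"/"red") are illustrative stand-ins for the colored markers.

```python
# Illustrative encoding of the semaphore convention; the string keys are
# stand-ins for the colored markers used in this README.
SEMAPHORE_POLICY = {
    "green": "active: evaluate model accuracy",
    "orange": "caution: not for performance comparisons",
    "red": "obsolete: do not use or report",
}

def reportable_splits(splits: dict) -> list:
    """Keep only split names whose semaphore allows reporting performance."""
    return [name for name, status in splits.items() if status == "green"]

example = {"split_a": "green", "split_b": "orange", "split_c": "red"}
print(reportable_splits(example))  # prints: ['split_a']
```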