We provide a PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers", in which we present a new method for separating a mixed audio sequence where multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while keeping the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.

✭ 731

python

157. Densepose

A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body

✭ 6,168

Jupyter Notebook python CMake cython Cuda C++

158. Mixup Cifar10

mixup: Beyond Empirical Risk Minimization

✭ 712

python

159. Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

✭ 24,868

python CMake cython C++ matlab Cuda

160. Lama

LAnguage Model Analysis

✭ 693

python

161. Supervision By Registration

Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors

✭ 682

python

162. Wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit

✭ 5,907

C++ python shell CMake perl c Dockerfile deep-learning speech-recognition end-to-end wav2letter

163. Fvcore

Collection of common code that's shared among different research projects in FAIR computer vision team.

✭ 643

python

164. Deepsdf

Learning Continuous Signed Distance Functions for Shape Representation

✭ 619

python

165. Imagenet Adversarial Training

ImageNet classifier with state-of-the-art adversarial robustness

✭ 606

python

166. Kill The Bits

Code for: "And the bit goes down: Revisiting the quantization of neural networks"

✭ 606

python

167. Fastmri

A large-scale dataset of both raw MRI measurements and clinical MRI images

✭ 592

python deep-learning convolutional-neural-networks

168. Habitat Lab

A modular high-level library to train embodied AI agents across a variety of tasks, environments, and simulators.

✭ 587

python deep-learning computer-vision reinforcement-learning ai robotics deep-reinforcement-learning research simulator

169. Quaternet

Proposes neural networks that can generate animation of virtual characters for different actions.

✭ 580

jupyter-notebook

170. Mobile Vision

Mobile vision models and code

✭ 580

python

171. Blink

Entity Linker solution

✭ 580

python

172. Craftassist

A virtual assistant bot in Minecraft

✭ 561

tex

173. Adaptive Span

Transformer training code for sequential tasks

✭ 551

python

174. Wsl Images

Weakly Supervised Learning On Images

✭ 548

python

175. Fasttext

Library for fast text representation and classification.

✭ 23,204

HTML C++ javascript python CSS shell

176. Fair self supervision benchmark

Scaling and Benchmarking Self-Supervised Visual Representation Learning

✭ 535

python

177. Spanbert

Code for using and evaluating SpanBERT.

✭ 527

python

178. Octconv

Code for paper

✭ 516

python

179. Inversecooking

Recipe Generation from Food Images

✭ 510

python

180. Pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

✭ 5,483

python C++ Cuda shell c javascript

181. Classifier Balancing

This repository contains code for the paper "Decoupling Representation and Classifier for Long-Tailed Recognition", published at ICLR 2020

✭ 501

python

182. Torchbeast

A PyTorch Platform for Distributed RL

✭ 498

python

183. Replica Dataset

The Replica Dataset v1 as published in https://arxiv.org/abs/1906.05797

✭ 478

184. Synsin

View synthesis for the public.

✭ 476

python

185. Dpr

Dense Passage Retriever is a set of tools and models for the open-domain Q&A task.

✭ 472

python

186. Mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

✭ 4,713

python deep-learning pytorch dialog pretrained-models vqa captioning multimodal multi-tasking textvqa hateful-memes

187. Kilt

Library for Knowledge Intensive Language Tasks

✭ 456

python

188. Vilbert Multi Task

Multi Task Vision and Language

✭ 421

jupyter-notebook

189. D2go

D2Go is a toolkit for efficient deep learning

✭ 426

python

190. Drqa

Reading Wikipedia to Answer Open-Domain Questions

✭ 4,177

python shell

191. Hydra

Hydra is a framework for elegantly configuring complex applications

✭ 5,207

python javascript ANTLR Jupyter Notebook CSS shell

192. Gtn

Automatic differentiation with weighted finite-state transducers.

✭ 408

193. Denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper "Real Time Speech Enhancement in the Waveform Domain", in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.

✭ 406

python

194. Open lth

A repository in preparation for open-sourcing lottery ticket hypothesis code.

✭ 390

python

195. Tabert

This repository contains source code for the TaBERT model, a pre-trained language model for learning joint representations of natural language utterances and (semi-)structured tables for semantic parsing. TaBERT is pre-trained on a massive corpus of 26M Web tables and their associated natural language context, and can be used as a drop-in replacement for a semantic parser's original encoder to compute representations for utterances and table schemas (columns).

✭ 390

python

196. Music Translation

A Universal Music Translation Network: a method for translating music across musical instruments and styles.

✭ 385

cuda

197. Adaptive Softmax

Implements an efficient softmax approximation as described in the paper "Efficient softmax approximation for GPUs" (http://arxiv.org/abs/1609.04309)

✭ 383

lua

198. Nle

The NetHack Learning Environment