JonathanShor / Doubletdetection

License: MIT
Doublet detection in single-cell RNA-seq data.

DoubletDetection

DoubletDetection is a Python 3 package for detecting doublets (technical errors) in single-cell RNA-seq count matrices.

Installing DoubletDetection

Install from PyPI

pip install doubletdetection

Install from source

git clone https://github.com/JonathanShor/DoubletDetection.git
cd DoubletDetection
pip3 install .

If you use pipenv to manage your virtual environment, it may fail to install from setup.py because of our custom Phenograph requirement. If so, run the following in the cloned repo:

pipenv run pip3 install .

Running DoubletDetection

To run basic doublet classification:

import doubletdetection
clf = doubletdetection.BoostClassifier()
# raw_counts is a cells by genes count matrix
labels = clf.fit(raw_counts).predict()
# higher means more likely to be doublet
scores = clf.doublet_score()
  • raw_counts is a scRNA-seq count matrix (cells by genes), and is array-like
  • labels is a 1-dimensional numpy ndarray with the value 1 representing a detected doublet, 0 a singlet, and np.nan an ambiguous cell.
  • scores is a 1-dimensional numpy ndarray representing a score for how likely a cell is to be a doublet. The score is used to create the labels.
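
Because ambiguous cells are marked with np.nan, the label vector cannot be safely cast to integers or compared with plain equality alone. The following is a minimal sketch of how the returned labels might be summarized. The helper summarize_labels and the example array are hypothetical and are not part of the doubletdetection API; the array only stands in for the output of predict().

```python
import numpy as np


def summarize_labels(labels):
    """Count doublets, singlets, and ambiguous cells in a label vector.

    Hypothetical helper, not part of doubletdetection. Assumes the
    convention described above: 1 = doublet, 0 = singlet, np.nan = ambiguous.
    """
    labels = np.asarray(labels, dtype=float)
    ambiguous = np.isnan(labels)  # nan never compares equal, so use isnan
    doublets = labels == 1        # comparisons with nan evaluate to False
    singlets = labels == 0
    return {
        "doublets": int(doublets.sum()),
        "singlets": int(singlets.sum()),
        "ambiguous": int(ambiguous.sum()),
    }


# Made-up labels standing in for clf.fit(raw_counts).predict()
example_labels = np.array([0, 1, np.nan, 0, 1, 0])
summary = summarize_labels(example_labels)
print(summary)
```

Keeping the ambiguous cells as a separate count, rather than folding them into the singlets, makes it easier to decide later whether to keep or discard them.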

The classifier works best when

  • There are several cell types present in the data
  • It is applied individually to each run in an aggregated count matrix
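
One way to apply a classifier to each run separately is to split the aggregated matrix by a per-cell run identifier and classify each block on its own. The sketch below assumes such an identifier is available. The names classify_per_run, run_ids, and toy_classifier are hypothetical, and the toy classifier is only a stand-in so the example runs without real data; it is not the DoubletDetection method.

```python
import numpy as np


def classify_per_run(counts, run_ids, classify_fn):
    """Apply classify_fn separately to the cells of each run.

    Hypothetical helper. counts is a cells by genes array, run_ids holds
    one run identifier per cell, and classify_fn maps a count sub-matrix
    to one label per cell.
    """
    counts = np.asarray(counts)
    run_ids = np.asarray(run_ids)
    labels = np.full(counts.shape[0], np.nan)
    for run in np.unique(run_ids):
        mask = run_ids == run
        # Each run is classified using only its own cells.
        labels[mask] = classify_fn(counts[mask])
    return labels


def toy_classifier(sub_counts):
    """Stand-in for illustration only: flag cells whose total count
    exceeds twice the median total of the cells passed in."""
    totals = sub_counts.sum(axis=1)
    return (totals > 2 * np.median(totals)).astype(float)


# Two made-up runs with very different sequencing depth
counts = np.array([[1, 1], [1, 2], [5, 5], [10, 10], [10, 12], [40, 40]])
run_ids = np.array(["a", "a", "a", "b", "b", "b"])

per_run_labels = classify_per_run(counts, run_ids, toy_classifier)
pooled_labels = toy_classifier(counts)
print(per_run_labels, pooled_labels)
```

With DoubletDetection itself, classify_fn could be something like lambda sub: doubletdetection.BoostClassifier().fit(sub).predict(), using the calls shown earlier. In the toy data, pooling the runs hides the outlier in the shallower run, which illustrates why per-run classification can matter.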

Version 2.5 adds an experimental clustering method (scanpy's Louvain clustering) that is much faster than Phenograph. We are still validating results from this new clustering method. See the notebook mentioned below for an example of this feature.

See our Jupyter notebook for an example on 8k PBMCs from 10x.

Obtaining data

Data can be downloaded from the 10x website.

Credits and citations

Gayoso, Adam, Shor, Jonathan, Carr, Ambrose J., Sharma, Roshan, Pe'er, Dana (2020, December 18). DoubletDetection (Version v3.0). Zenodo. http://doi.org/10.5281/zenodo.2678041

We also thank the participants of the 1st Human Cell Atlas Jamboree, Chun J. Ye for providing data useful in developing this method, and Itsik Pe'er for providing guidance in early development as part of the Computational Genomics class at Columbia University.

This project is licensed under the terms of the MIT license.
