microsoft / Elevation

License: MIT

End-to-end guide design for CRISPR/Cas9 with machine learning

Elevation

Off-target effects of the CRISPR-Cas9 system can lead to suboptimal gene editing outcomes and are a bottleneck in its development. Here, we introduce two interdependent machine learning models for the prediction of off-target effects of CRISPR-Cas9. The approach, which we named Elevation, scores individual guide–target pairs, and aggregates such scores into a single, overall summary guide score.

See our official project page for more detail.

Publications

Please cite this paper if using our predictive model:

Jennifer Listgarten*, Michael Weinstein*, Benjamin P. Kleinstiver, Alexander A. Sousa, J. Keith Joung, Jake Crawford, Kevin Gao, Luong Hoang, Melih Elibol, John G. Doench*, Nicolo Fusi*. Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs. Nature Biomedical Engineering, 2018.

(* = equal contributions/corresponding authors)

Dependencies

Software dependencies

Install Anaconda >= 4.1.1: https://www.continuum.io/downloads

Download and process data dependencies

First, download and process the necessary public data files using the CRISPR/download_data.sh script. On Windows, you can either run the equivalent of the script's wget commands by hand or run the script under Bash on Windows, Cygwin, or a similar environment.

After running the script, your directory structure should look like:

elevation/
    CRISPR/
        data/
            offtarget/
                Haeussler/
                CD33_data_postfilter.xlsx
                nbt.3117-S2.xlsx
                STable 18 CD33_OffTargetdata.xlsx*
                STable 19 FractionActive_dlfc_lookup.xlsx*
                Supplementary Table 10.xlsx
        gene_sequences/
            CD33_sequence.txt
        guideseq/
            guideseq.py
            guideseq_unique.txt
    cache/
    CHANGELOG.md
    elevation/
    ...
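
A quick way to confirm that the download script produced the layout above is to check for the expected files. The following is a minimal sketch and not part of Elevation: the names `EXPECTED_FILES`, `find_missing`, and `report` are hypothetical, the paths are copied from the tree above, and the entries shown with a trailing `*` are left out because their exact file names are not clear from the listing. It uses only `os.path`, so it should run in the Python 2.7 environment used later as well as in Python 3.

```python
from __future__ import print_function
import os

# Relative paths copied from the directory tree above. Entries listed
# with a trailing "*" are omitted because their exact names are unclear.
EXPECTED_FILES = [
    ("CRISPR", "data", "offtarget", "CD33_data_postfilter.xlsx"),
    ("CRISPR", "data", "offtarget", "nbt.3117-S2.xlsx"),
    ("CRISPR", "data", "offtarget", "Supplementary Table 10.xlsx"),
    ("CRISPR", "gene_sequences", "CD33_sequence.txt"),
    ("CRISPR", "guideseq", "guideseq.py"),
    ("CRISPR", "guideseq", "guideseq_unique.txt"),
]


def find_missing(repo_root, expected=EXPECTED_FILES):
    """Return the expected paths (relative to repo_root) that are not files."""
    missing = []
    for parts in expected:
        relative = os.path.join(*parts)
        if not os.path.isfile(os.path.join(repo_root, relative)):
            missing.append(relative)
    return missing


def report(repo_root="."):
    """Print one line per missing file, or a short confirmation."""
    missing = find_missing(repo_root)
    if missing:
        for path in missing:
            print("missing:", path)
    else:
        print("all expected data files are present")
```

Calling `report()` from the root of the repository prints any files that still need to be downloaded.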

One additional data file must be generated via a genome search, using our Elevation-search (aka "dsNickFury") software, which must be installed separately. The repository is located at https://github.com/michael-weinstein/dsNickFury3PlusOrchid.

Note: If you are not planning to run a genome search for off-targets, you do not need to follow the data-dependency instructions in the dsNickFury documentation. Follow these steps instead:
- Download and unzip the hg38 index (linked in the dsNickFury README) into dsNickFury/dsNickFury3PlusOrchid
- Install anaconda2 into dsNickFury/dependencies
- Use anaconda2 to create a Python 3 environment (e.g. dsNickFury/dependencies/anaconda2/bin/conda create -n dsNickFury python==3)
- Edit the dsNickFury/dsNickFury3PlusOrchid/settings.py file so that network_root points to the directory containing dsNickFury, and anaconda_root points to the location of the anaconda2 install
- Edit the CRISPR/guideseq/guideseq.py file so that DSNF_DIRECTORY points to the dsNickFury/dsNickFury3PlusOrchid directory
At this point, you should be able to run CRISPR/guideseq/guideseq.py. This takes a while: roughly 8 hours on a desktop machine.
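
For orientation, the two file edits described in the steps above might look roughly like the following. This is only a sketch: the names `network_root`, `anaconda_root`, and `DSNF_DIRECTORY` come from the instructions, but the assumption that they are plain module-level string assignments is unverified, and `BASE` and every path shown are hypothetical placeholders to adapt to your machine.

```python
import os

# Hypothetical placeholder: the directory that contains the dsNickFury checkout.
BASE = "/path/to/tools"

# In dsNickFury/dsNickFury3PlusOrchid/settings.py (assumed format):
network_root = BASE
anaconda_root = os.path.join(BASE, "dsNickFury", "dependencies", "anaconda2")

# In CRISPR/guideseq/guideseq.py (assumed format):
DSNF_DIRECTORY = os.path.join(BASE, "dsNickFury", "dsNickFury3PlusOrchid")
```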

Once the script finishes, there should be a file called guideseq_unique_MM6_end0_lim999999999.hdf5 in the CRISPR/guideseq directory.
Your directory structure should now look something like this:

elevation/
    CRISPR/
        data/
            offtarget/
                Haeussler/
                CD33_data_postfilter.xlsx
                nbt.3117-S2.xlsx
                STable 18 CD33_OffTargetdata.xlsx*
                STable 19 FractionActive_dlfc_lookup.xlsx*
                Supplementary Table 10.xlsx
        gene_sequences/
            CD33_sequence.txt
        guideseq/
            guideseq.py
            guideseq_unique.txt
            guideseq_unique_MM6_end0_lim999999999.hdf5
            ...
    cache/
    CHANGELOG.md
    elevation/
    ...

You can now install the elevation dependencies and run the software.

Install / Develop

Create conda env for elevation: conda create -n elevation python=2.7
Activate conda env:
- (windows) activate elevation
- (linux) source activate elevation
Install Azimuth version 2.0.0: pip install git+https://github.com/MicrosoftResearch/Azimuth.git
Overwrite some of the Azimuth dependencies, since Elevation uses different versions:
- conda install pytables
- conda install scikit-learn==0.18.1
- pip install pandas==0.19.1 (installing these packages via conda/pip avoids recompiling them from source)
Install/Develop elevation:
- To install, python setup.py install
- To develop, python setup.py develop
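
Because Azimuth pulls in its own dependency versions, it can be worth confirming that the pinned versions listed above are the ones that ended up in the environment. The sketch below is not part of Elevation; the helper names are hypothetical, and it relies only on the `__version__` attribute that pandas and scikit-learn (imported as `sklearn`) expose.

```python
from __future__ import print_function
import importlib

# Import name -> version pinned in the installation steps above.
PINS = {"sklearn": "0.18.1", "pandas": "0.19.1"}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Running `print(find_mismatches(PINS))` inside the activated elevation environment prints an empty dictionary when both pins are satisfied.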

Test installation

Make sure everything is set up properly by running the following command from the root directory of the repository.

python -m pytest tests

or, alternatively:

nosetests tests

Use

Guide Sequence Prediction

import elevation.load_data
from elevation.cmds.predict import Predict

# load data
num_x = 100
roc_data, roc_Y_bin, roc_Y_vals = elevation.load_data.load_HauesslerFig2(1)
wildtype = list(roc_data['30mer'])[:num_x]
offtarget = list(roc_data['30mer_mut'])[:num_x]

# initialize predictor
p = Predict()

# run prediction
preds = p.execute(wildtype, offtarget)

# preds is a dictionary of the form {'linear-raw-stacker': [...], 'CFD': [...]}
for i in range(num_x):
    scores = ", ".join("%s=%s" % (name, values[i]) for name, values in preds.items())
    print("%s %s %s" % (wildtype[i], offtarget[i], scores))
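
Since preds maps each model name to a list of scores aligned with the input pairs, the pairs can be sorted by any one model's score. The following is a minimal sketch with placeholder inputs that only mimic the shape of the real data: the function name `rank_by_score` is hypothetical, and the sequence labels and score values are made up for illustration, not output from Elevation.

```python
from __future__ import print_function


def rank_by_score(wildtype, offtarget, preds, model="CFD"):
    """Return (score, wildtype, offtarget) tuples, highest score first."""
    rows = zip(preds[model], wildtype, offtarget)
    return sorted(rows, key=lambda row: row[0], reverse=True)


# Placeholder inputs: labels stand in for 30mer sequences, scores are invented.
wildtype = ["guide_A", "guide_A", "guide_B"]
offtarget = ["site_1", "site_2", "site_3"]
preds = {
    "linear-raw-stacker": [0.10, 0.70, 0.30],
    "CFD": [0.20, 0.90, 0.40],
}

for score, wt, ot in rank_by_score(wildtype, offtarget, preds, model="CFD"):
    print(wt, ot, score)
```

Passing `model="linear-raw-stacker"` sorts by the other score list instead.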

Aggregation Prediction

import numpy as np
import pickle
import elevation.load_data
from elevation.cmds.predict import Predict
from elevation import settings
from elevation import aggregation

# load data
num_x = 100
roc_data, roc_Y_bin, roc_Y_vals = elevation.load_data.load_HauesslerFig2()
wildtype = list(roc_data['30mer'])[:num_x]
offtarget = list(roc_data['30mer_mut'])[:num_x]

# initialize guide seq predictor
p = Predict()

# run prediction
preds = p.execute(wildtype, offtarget)

# load aggregation model
with open(settings.agg_model_file, 'rb') as fh:  # pickle files should be opened in binary mode
    final_model, other = pickle.load(fh)

# compute aggregated score
isgenic = np.zeros(num_x, dtype=bool)
result = aggregation.get_aggregated_score(
    preds['linear-raw-stacker'],
    preds['CFD'],
    isgenic,
    final_model)
print(result)

Recomputing Models

Models are persisted as pickle files and, under certain circumstances, may need to be recomputed. Elevation models depend on the CRISPR repository. To recompute models, run the following command.

elevation-fit --crispr_repo_dir /home/melih/dev/CRISPR

where /home/melih/dev/CRISPR corresponds to the directory that contains the CRISPR repository you'd like to use to recompute the models.
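
If the refit needs to be triggered from a script, the command above can be wrapped with the standard `subprocess` module. This is a minimal sketch: the function names are hypothetical, and it assumes only that `elevation-fit` is on the PATH and accepts the `--crispr_repo_dir` option shown above.

```python
from __future__ import print_function
import os
import subprocess


def build_fit_command(crispr_repo_dir):
    """Build the argument list for the elevation-fit command shown above."""
    return ["elevation-fit", "--crispr_repo_dir", os.path.abspath(crispr_repo_dir)]


def refit_models(crispr_repo_dir):
    """Run elevation-fit, failing early if the directory does not exist."""
    if not os.path.isdir(crispr_repo_dir):
        raise ValueError("not a directory: %s" % crispr_repo_dir)
    # check_call raises CalledProcessError on a non-zero exit status.
    subprocess.check_call(build_fit_command(crispr_repo_dir))
```

Checking the directory first gives a clearer error than letting the command fail partway through.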

New Fixtures

After making changes to the models, to generate new fixtures (data used to test prediction consistency), run elevation-fixtures.

Run python -m pytest tests to make sure tests are still passing.

Settings

If you'd like to reconfigure the default location of CRISPR, the temp dir in which pickles are stored, etc., copy elevation/settings_template.py to elevation/settings.py and edit elevation/settings.py before installation. If elevation/settings.py does not exist at install time, then elevation/settings_template.py is used to create elevation/settings.py.
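
The fallback described above can be pictured as follows. This is a sketch of the documented behavior, not the code the Elevation installer actually runs; the function name `ensure_settings` is hypothetical.

```python
import os
import shutil


def ensure_settings(package_dir):
    """Copy settings_template.py to settings.py unless settings.py exists.

    Returns True if a copy was made, False if settings.py was already there.
    """
    template = os.path.join(package_dir, "settings_template.py")
    settings = os.path.join(package_dir, "settings.py")
    if os.path.exists(settings):
        # An existing settings.py is never overwritten.
        return False
    shutil.copyfile(template, settings)
    return True
```

The practical consequence is that any edits to elevation/settings.py need to be in place before installation, since an existing file is left untouched.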

Contacting us

You can submit bug reports using the GitHub issue tracker. If you have any other questions, please contact us at [email protected].
