broadinstitute / Genomics In The Cloud

Licence: bsd-3-clause
Source code and related materials for the O'Reilly book

genomics-in-the-cloud

Source code and related materials for Genomics in the Cloud, an O'Reilly book by Geraldine A. Van der Auwera and Brian D. O'Connor.

This site is a work in progress; we will continue to add content here now that the book has been released.

Find the electronic version of the book today at https://oreil.ly/genomics-cloud or on Amazon (Kindle version), or pre-order the paperback version on Amazon.

Book overview

Data in the genomics field is booming. In just a few years, organizations such as the National Institutes of Health (NIH) will host 50+ petabytes—or 50 million gigabytes—of genomic data, and they’re turning to cloud infrastructure to make that data available to the research community. How do you adapt analysis tools and protocols to access and analyze that data in the cloud?

With this practical book, researchers will learn how to work with genomics algorithms using open source tools including the Genome Analysis Toolkit (GATK), Docker, WDL, and Terra. Geraldine Van der Auwera, longtime custodian of the GATK user community, and Brian O’Connor of the UC Santa Cruz Genomics Institute guide you through the process. You’ll learn by working with real data and genomics algorithms from the field.

This book takes you through:

  • Essential genomics and computing technology background
  • Basic cloud computing operations
  • Getting started with GATK
  • Three major GATK Best Practices pipelines for variant discovery
  • Automating analysis with scripted workflows using WDL and Cromwell
  • Scaling up workflow execution in the cloud, including parallelization and cost optimization
  • Interactive analysis in the cloud using Jupyter notebooks
  • Secure collaboration and computational reproducibility using Terra

Resources

List of commands

See the commands folder for text files that let you easily copy and paste the commands from the hands-on exercises.

Figures

For those of you reading the print version of the book, which does not include color figures, we've made the figures available in the figures directory of the GCS bucket.
You may use all figures except 3-3 and 6-15 in your own non-commercial work, preferably with a notice of attribution referring to the book. For commercial use, please contact [email protected]. Figures 3-3 and 6-15 do not belong to us, so you must request permission from their respective owners, which are noted in the book.

Blog

We're developing a blog for the book at https://broadinstitute.github.io/genomics-in-the-cloud/ where we will publish blog posts, additional tutorials, errata for the book, and regular updates on new features that you may be interested in. Feel free to suggest blog topics by reaching out to us on Twitter or LinkedIn (see contact info below).

Reporting errors

If you encounter errors or broken links in the book, please file an issue on O'Reilly's Errata page. Anything reported there that we can verify will get fixed and updated in both the electronic versions and subsequent printing runs of the book, so others won't run into the same problems.

We don't use GitHub Issues for this project, to avoid confusion and redundancy with the O'Reilly Errata page.

Getting help

If you run into problems while working through the hands-on exercises, or if you have follow-up questions about the topics we discuss in the book, please post your questions in either the GATK forum or the Terra forum. The frontline support team will most likely be able to address your questions, and for anything else they will loop us into the conversation if you mention that your question is related to our book. If you're not sure which forum to use, just flip a coin; it's the same team that maintains both communities.

Remember also that you can often save yourself some time by searching the GATK documentation or Terra documentation before posting a question -- that way you don't have to wait for someone to get back to you.

Getting in touch with us

If you'd like to get in touch, you can reach us on Twitter (@VdAGeraldine and @boconnor) and on LinkedIn (Geraldine and Brian). We look forward to hearing what you think of the book! If you like it, please consider posting a review on Amazon.
