cognoma / cancer-data

License: BSD 3-Clause (LICENSE-BSD.md) and CC0-1.0 (LICENSE-CC0.md)
TCGA data acquisition and processing for Project Cognoma

Cancer data acquisition and processing for Project Cognoma

This is a mixed notebook and data repository for retrieving cancer data for Project Cognoma. Currently, all data is from the TCGA Pan-Cancer collection of the UCSC Xena Browser.

Workflow

The data acquisition and analysis are executed by running a series of Jupyter notebooks in a fixed order.

The execute.sh script runs the notebooks in that order. After installing and activating the environment, run bash execute.sh from the repository's root directory.
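As a rough illustration of what running notebooks in order involves, the sketch below builds one jupyter nbconvert command per notebook, sorted by a leading numeric prefix. It is not the contents of execute.sh: the numbered-filename convention and the helper names ordered_notebooks, build_commands, and run_all are assumptions made for this example.

```python
import re
import subprocess
from pathlib import Path


def ordered_notebooks(directory):
    """Return the notebooks in `directory`, sorted by leading number.

    Assumes (hypothetically) that notebooks are named like `1.something.ipynb`.
    """
    def sort_key(path):
        match = re.match(r"(\d+)", path.name)
        # Notebooks without a numeric prefix sort last, alphabetically.
        if match:
            return (0, int(match.group(1)), path.name)
        return (1, 0, path.name)

    return sorted(Path(directory).glob("*.ipynb"), key=sort_key)


def build_commands(directory):
    """Build one `jupyter nbconvert` invocation per notebook, in execution order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", str(nb)]
        for nb in ordered_notebooks(directory)
    ]


def run_all(directory="."):
    """Execute every notebook in order, stopping at the first failure."""
    for command in build_commands(directory):
        subprocess.run(command, check=True)
```

Calling run_all() from the repository root would execute each notebook in place. Sorting on the integer value of the prefix, rather than on the raw filename, keeps a notebook numbered 10 from running before one numbered 2.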

Directories

The repository contains the following directories:

  • download — contains files retrieved from an external location whose content is unmodified. Large downloaded files are tracked using Git LFS. Associated metadata files are also retained for versioning.
  • data — contains generated datasets. The complete matrix files are not currently tracked due to file size, but randomly-subsetted versions are available for development use (see data/subset).
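A randomly subsetted matrix of the kind described for data/subset could be produced along the following lines. This is only a sketch: the function name random_subset, the default seed, and the choice to sample rows while keeping the header are assumptions, not the procedure this repository uses.

```python
import random


def random_subset(rows, n, seed=0):
    """Return the header plus `n` randomly chosen data rows, in original order.

    `rows` is a list whose first element is the header row.
    A fixed seed (an assumption of this sketch) makes the subset reproducible.
    """
    header, data = rows[0], rows[1:]
    rng = random.Random(seed)
    n = min(n, len(data))  # never ask for more rows than exist
    chosen = sorted(rng.sample(range(len(data)), n))
    return [header] + [data[i] for i in chosen]
```

Sampling row indices and then sorting them keeps the subset in the same order as the full matrix, which makes the two files easier to compare.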

Download

DOI: 10.6084/m9.figshare.3487685

The complete datasets created by this repository (data/expression-matrix.tsv.bz2 and data/mutation-matrix.tsv.bz2) are uploaded to figshare. Because the upload is a manual process, the latest version on figshare may lag behind this repository. Check the figshare REFERENCES section to see which commit the datasets derive from.
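The .tsv.bz2 files can be read without decompressing them to disk first. The sketch below uses only the Python standard library; the function name read_matrix and the max_rows parameter are assumptions for illustration, and the sketch makes no claim about what the rows and columns of each matrix represent.

```python
import bz2
import csv


def read_matrix(path, max_rows=None):
    """Read a bz2-compressed TSV file and return (header, rows).

    `max_rows` limits how many data rows are read, which helps when
    peeking at a large matrix.
    """
    with bz2.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)  # first line holds the column names
        rows = []
        for row in reader:
            if max_rows is not None and len(rows) >= max_rows:
                break
            rows.append(row)
    return header, rows
```

For example, assuming the expression matrix has been downloaded into the data directory, read_matrix("data/expression-matrix.tsv.bz2", max_rows=5) returns the column names and the first five data rows as lists of strings.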

Environment

This repository uses conda to manage its environment, which is named cognoma-cancer-data. The required packages and versions are listed in environment.yml. If, as a developer, you require an additional package, add it to environment.yml.

The following commands install and activate the environment:

# Create the cognoma-cancer-data conda environment
# (to overwrite an existing one, older conda releases accept --force)
conda env create --file=environment.yml

# Activate the conda environment (assumes conda >= 4.4)
conda activate cognoma-cancer-data

License

This repository is dual licensed as BSD 3-Clause and CC0 1.0, meaning any repository content can be used under either license. This licensing arrangement ensures source code is available under an OSI-approved license, while non-code content (such as figures, data, and documentation) is maximally reusable under a public domain dedication.
