All Projects → mlin → Genomicsqlite

mlin / Genomicsqlite

Licence: apache-2.0
Genomics Extension for SQLite

Projects that are alternatives of or similar to Genomicsqlite

Sns
Analysis pipelines for sequencing data
Stars: ✭ 43 (-52.22%)
Mutual labels:  bioinformatics, genomics, sequencing
saffrontree
SaffronTree: Reference free rapid phylogenetic tree construction from raw read data
Stars: ✭ 17 (-81.11%)
Mutual labels:  bioinformatics, genomics, sequencing
Deepvariant
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
Stars: ✭ 2,404 (+2571.11%)
Mutual labels:  bioinformatics, genomics, sequencing
Artemis
Artemis is a free genome viewer and annotation tool that allows visualization of sequence features and the results of analyses within the context of the sequence, and its six-frame translation
Stars: ✭ 135 (+50%)
Mutual labels:  bioinformatics, genomics, sequencing
gff3toembl
Converts Prokka GFF3 files to EMBL files for uploading annotated assemblies to EBI
Stars: ✭ 27 (-70%)
Mutual labels:  bioinformatics, genomics, sequencing
Hgvs
Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`
Stars: ✭ 138 (+53.33%)
Mutual labels:  bioinformatics, genomics, sequencing
Gubbins
Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins
Stars: ✭ 67 (-25.56%)
Mutual labels:  bioinformatics, genomics, sequencing
Sequenceserver
Intuitive local web frontend for the BLAST bioinformatics tool
Stars: ✭ 198 (+120%)
Mutual labels:  bioinformatics, genomics, sequencing
Fastq.bio
An interactive web tool for quality control of DNA sequencing data
Stars: ✭ 76 (-15.56%)
Mutual labels:  bioinformatics, genomics, sequencing
plasmidtron
Assembling the cause of phenotypes and genotypes from NGS data
Stars: ✭ 27 (-70%)
Mutual labels:  bioinformatics, genomics, sequencing
Circlator
A tool to circularize genome assemblies
Stars: ✭ 121 (+34.44%)
Mutual labels:  bioinformatics, genomics, sequencing
Awesome Sequencing Tech Papers
A collection of publications on comparison of high-throughput sequencing technologies.
Stars: ✭ 21 (-76.67%)
Mutual labels:  bioinformatics, genomics, sequencing
Genomics
A collection of scripts and notes related to genomics and bioinformatics
Stars: ✭ 101 (+12.22%)
Mutual labels:  bioinformatics, genomics, sequencing
Roary
Rapid large-scale prokaryote pan genome analysis
Stars: ✭ 176 (+95.56%)
Mutual labels:  bioinformatics, genomics, sequencing
Ariba
Antimicrobial Resistance Identification By Assembly
Stars: ✭ 96 (+6.67%)
Mutual labels:  bioinformatics, genomics, sequencing
catch
A package for designing compact and comprehensive capture probe sets.
Stars: ✭ 55 (-38.89%)
Mutual labels:  bioinformatics, genomics, sequencing
Galaxy
Data intensive science for everyone.
Stars: ✭ 812 (+802.22%)
Mutual labels:  bioinformatics, genomics, sequencing
Gatk
Official code repository for GATK versions 4 and up
Stars: ✭ 1,002 (+1013.33%)
Mutual labels:  bioinformatics, genomics, sequencing
Fluent Sqlite Driver
Fluent driver for SQLite
Stars: ✭ 51 (-43.33%)
Mutual labels:  sqlite, sqlite3
Dram
Distilled and Refined Annotation of Metabolism: A tool for the annotation and curation of function for microbial and viral genomes
Stars: ✭ 47 (-47.78%)
Mutual labels:  bioinformatics, genomics

Genomics Extension for SQLite

("GenomicSQLite")

This SQLite3 loadable extension adds features to the ubiquitous embedded RDBMS supporting applications in genome bioinformatics:

  • genomic range indexing for overlap queries & joins
  • in-SQL utility functions, e.g. reverse-complement DNA, parse "chr1:2,345-6,789"
  • automatic streaming storage compression (also available standalone)
  • reading directly from HTTP(S) URLs (also available standalone)
  • pre-tuned settings for "big data"

This October 2020 poster discusses the context and long-run ambitions:

GenomicSQLite Poster

Our Colab notebook demonstrates key features with Python, one of several language bindings.

USE AT YOUR OWN RISK: The extension makes fundamental changes to the database storage layer. While designed to preserve ACID transaction safety, it's young and unlikely to have zero bugs. This project is not associated with the SQLite developers.

Installation & Programming Guide

Start Here 👉 full documentation site

We supply the extension prepackaged for Linux x86-64 and macOS Catalina. An up-to-date version of SQLite itself is also required, as specified in the docs.

Programming language support:

  • C/C++
  • Python ≥3.6
  • Java & JVM languages
  • Rust

More to come. (Help wanted; see Language Bindings Guide)

Building from source

build

Most will prefer to install a pre-built shared library (see above). To build from source, see our Actions yml (Ubuntu 20.04) or Dockerfile (CentOS 7) used to build the more-portable releases. Briefly, you'll need:

  • C++11 build system
  • CMake ≥ 3.14
  • Dev packages: SQLite ≥ 3.31.0, Zstandard ≥ 1.3.4, libcurl

And incantations:

cmake -DCMAKE_BUILD_TYPE=Release -B build .
cmake --build build -j 4 --target genomicsqlite

...generating build/libgenomicsqlite.so. To run the test suite, you'll furthermore need:

  • htslib ≥ 1.9, samtools, and tabix
  • pigz
  • Python ≥ 3.6 and packages: pytest pytest-xdist pre-commit black pylint flake8
  • clang-format & cppcheck

to:

pre-commit run --all-files  # formatters+linters
cmake -DCMAKE_BUILD_TYPE=Debug -B build .
cmake --build build -j 4
env -C build ctest -V
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].