cljam

A DNA Sequence Alignment/Map (SAM) library for Clojure. [API Reference] [Annotated Source]

Installation

cljam is available as a Maven artifact from Clojars.

Clojure CLI/deps.edn:

cljam/cljam {:mvn/version "0.8.4"}

Leiningen/Boot:

[cljam "0.8.4"]

Breaking changes in 0.8.0

  • cljam.io.tabix has been rewritten. #180
  • cljam.io.bam-index.writer/pos->lidx-offset has been moved to cljam.io.util.bin/pos->lidx-offset. #180
  • cljam.io.sam.util/reg->bin has been moved to cljam.io.util.bin/reg->bin. In addition, the coordinate system of its argument has changed from 0-based half-open to 1-based fully-closed. #190
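
For code that still holds regions in the old 0-based half-open form, the coordinate change above amounts to shifting the start by one while leaving the end untouched. The following is a minimal sketch of that conversion; the helper names half-open->closed and closed->half-open are hypothetical and are not part of cljam.

```clojure
;; Hypothetical helpers for illustration only; they are not part of cljam.

(defn half-open->closed
  "Converts a 0-based half-open interval [beg, end) into the equivalent
  1-based fully-closed interval [start, end]."
  [[beg end]]
  [(inc beg) end])

(defn closed->half-open
  "Converts a 1-based fully-closed interval [start, end] back into the
  equivalent 0-based half-open interval [beg, end)."
  [[start end]]
  [(dec start) end])

;; The first ten bases of a reference are [0, 10) in the old system
;; and [1, 10] in the new one.
(half-open->closed [0 10]) ;=> [1 10]
(closed->half-open [1 10]) ;=> [0 10]
```

A caller migrating to cljam.io.util.bin/reg->bin would presumably apply a conversion like half-open->closed before passing its coordinates on.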

Getting started

To read a SAM/BAM format file,

(require '[cljam.io.sam :as sam])

;; Open a file
(with-open [r (sam/reader "path/to/file.bam")]
  ;; Retrieve header
  (sam/read-header r)
  ;; Retrieve alignments
  (doall (take 5 (sam/read-alignments r))))
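
Once alignments are read, they can be processed with ordinary sequence functions. The sketch below counts mapped reads by testing the 0x4 (segment unmapped) bit defined for the FLAG field in the SAM specification. It assumes each alignment behaves like a map with an integer :flag key; the function names mapped? and count-mapped are hypothetical and not part of cljam.

```clojure
;; Hypothetical helpers, assuming each alignment is map-like and
;; carries the SAM FLAG field under the :flag key.

(defn mapped?
  "True when the 0x4 (segment unmapped) bit of the SAM flag is not set."
  [aln]
  (zero? (bit-and (:flag aln) 0x4)))

(defn count-mapped
  "Counts the alignments whose flag marks them as mapped."
  [alns]
  (count (filter mapped? alns)))

;; Plain maps stand in for alignments here.
(count-mapped [{:flag 0} {:flag 4} {:flag 99}]) ;=> 2

(comment
  ;; With a real file, consume the lazy sequence inside with-open,
  ;; before the reader is closed.
  (with-open [r (sam/reader "path/to/file.bam")]
    (count-mapped (sam/read-alignments r))))
```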

To create a sorted file,

(require '[cljam.io.sam :as sam]
         '[cljam.algo.sorter :as sorter])

(with-open [r (sam/reader "path/to/file.bam")
            w (sam/writer "path/to/sorted.bam")]
  ;; Sort by chromosomal coordinates
  (sorter/sort-by-pos r w))

To create a BAM index file,

(require '[cljam.algo.bam-indexer :as bai])

;; Create a new BAM index file
(bai/create-index "path/to/sorted.bam" "path/to/sorted.bam.bai")

To calculate coverage depth for a BAM file,

(require '[cljam.io.sam :as sam]
         '[cljam.algo.depth :as depth])

(with-open [r (sam/reader "path/to/sorted.bam")]
  ;; Pileup "chr1" alignments
  (depth/depth r {:chr "chr1", :start 1, :end 10}))
;;=> (0 0 0 0 0 0 1 1 3 3)
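
The result above holds one depth value per position in the requested range, so it can be summarized with plain Clojure. The following is a minimal sketch; depth-summary is a hypothetical helper, not a cljam function, and it works on any sequence of non-negative integers.

```clojure
;; Hypothetical helper for illustration; not part of cljam.

(defn depth-summary
  "Summarizes a sequence of per-position depths: number of positions,
  mean depth, maximum depth, and positions with at least one read."
  [depths]
  (let [n (count depths)]
    {:positions n
     :mean      (if (pos? n) (double (/ (reduce + depths) n)) 0.0)
     :max       (reduce max 0 depths)
     :covered   (count (filter pos? depths))}))

;; Applied to the example output shown above.
(depth-summary '(0 0 0 0 0 0 1 1 3 3))
;;=> {:positions 10, :mean 0.8, :max 3, :covered 4}
```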

If you are a Clojure beginner, read Getting Started for Clojure Beginners.

Command-line tool

cljam provides a command-line tool to use the features easily.

Executable installation

lein bin creates a standalone console executable in the target directory.

$ lein bin
Creating standalone executable: /path/to/cljam/target/cljam

Copy the executable cljam somewhere in your $PATH.

Usage

All commands are listed by cljam -h, and detailed help for each command is displayed by cljam [cmd] -h.

$ cljam view -h

For example, to display the contents of a SAM file including the header,

$ cljam view --header path/to/file.sam

See command-line tool manual for more information.

Development

Test

To run tests,

  • lein test for basic tests,
  • lein test :slow for slow tests with local resources,
  • lein test :remote for tests with remote resources.

To get coverage,

$ lein cloverage

Then open target/coverage/index.html.

Generating documentation

cljam uses Codox for API reference and Marginalia for annotated source code.

$ lein docs

generates these documents in the target/docs and target/literate directories.

Citing cljam

T. Takeuchi, A. Yamada, T. Aoki, and K. Nishimura. cljam: a library for handling DNA sequence alignment/map (SAM) with parallel processing. Source Code for Biology and Medicine, Vol. 11, No. 1, pp. 1-4, 2016.

Contributors

Sorted by first commit.

License

Copyright 2013-2023 Xcoo, Inc.

Licensed under the Apache License, Version 2.0.
