
modernatx / seqlike

License: Apache-2.0
Unified biological sequence manipulation in Python

Programming Languages

Python, Dockerfile, shell

Projects that are alternatives of or similar to seqlike

perf
PERF is an Exhaustive Repeat Finder
Stars: ✭ 26 (-85.64%)
Mutual labels:  biopython, sequence
cakephp-sequence
CakePHP plugin for maintaining a contiguous sequence of records
Stars: ✭ 41 (-77.35%)
Mutual labels:  sequence
Biopython
Official git repository for Biopython (originally converted from CVS)
Stars: ✭ 2,936 (+1522.1%)
Mutual labels:  biopython
SKAT
Sequence kernel association test (SKAT)
Stars: ✭ 24 (-86.74%)
Mutual labels:  sequence
pyCircos
python Circos
Stars: ✭ 233 (+28.73%)
Mutual labels:  biopython
sequence
Efficient GUID generation algorithm (sequence): a 64-bit auto-incrementing ID algorithm based on Snowflake.
Stars: ✭ 35 (-80.66%)
Mutual labels:  sequence
Easysequence
EasySequence is a powerful fundamental library for processing sequence types such as arrays, sets, and dictionaries. Any object that conforms to the NSFastEnumeration protocol can be initialized as an EZSequence instance and operated on, then converted back to its original type.
Stars: ✭ 150 (-17.13%)
Mutual labels:  sequence
e2-scripts
Scripts for working with electribe 2.
Stars: ✭ 20 (-88.95%)
Mutual labels:  sequence
pydna
Clone with Python! Data structures for double stranded DNA & simulation of homologous recombination, Gibson assembly, cut & paste cloning.
Stars: ✭ 109 (-39.78%)
Mutual labels:  biopython
paxoid
Paxos based masterless ID/Sequence generator.
Stars: ✭ 20 (-88.95%)
Mutual labels:  sequence
bio
A lightweight and high-performance bioinformatics package in Golang
Stars: ✭ 80 (-55.8%)
Mutual labels:  sequence
react-sequence-animator
A React library for sequence animations
Stars: ✭ 23 (-87.29%)
Mutual labels:  sequence
LSTM-Mobility-Model
LSTM Mobility Model implementation using Tensorflow
Stars: ✭ 19 (-89.5%)
Mutual labels:  sequence
sequence-as-promise
Executes array of functions as sequence and returns promise
Stars: ✭ 23 (-87.29%)
Mutual labels:  sequence
vscode-commands
Run commands from Tree View / Status Bar / Quick Pick.
Stars: ✭ 45 (-75.14%)
Mutual labels:  sequence
Sequitur
Library of autoencoders for sequential data
Stars: ✭ 162 (-10.5%)
Mutual labels:  sequence
biopython-coronavirus
Biopython Jupyter Notebook tutorial to characterize a small genome
Stars: ✭ 80 (-55.8%)
Mutual labels:  biopython
StackFlowView
Enforce stack behaviour for custom UI flow.
Stars: ✭ 35 (-80.66%)
Mutual labels:  sequence
tidysq
tidy processing of biological sequences in R
Stars: ✭ 29 (-83.98%)
Mutual labels:  biological-sequences
ddal
DDAL(Distributed Data Access Layer) is a simple solution to access database shard.
Stars: ✭ 33 (-81.77%)
Mutual labels:  sequence

SeqLike - flexible biological sequence objects in Python


Introduction

SeqLike is a single-object API that makes working with biological sequences in Python more ergonomic. It handles anything that behaves like a sequence.

Built around the Biopython SeqRecord class, SeqLikes abstract over the semantics of molecular biology (DNA -> RNA -> AA) and data structures (strings, Seqs, SeqRecords, numerical encodings) to allow manipulation of a biological sequence at the level which is most computationally convenient.

Code samples and examples

Build data-type agnostic functions

def f(seq: SeqLikeType, *args):
    seq = SeqLike(seq, seq_type="nt").to_seqrecord()
    # ...

Streamline conversion to/from ML friendly representations

prediction = model(aaSeqLike('MSKGEELFTG').to_onehot())
new_seq = ntSeqLike(generative_model.sample(), alphabet="-ACGTUN")
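
To make the idea of an "ML friendly representation" concrete, here is a minimal pure-Python sketch of one-hot encoding and decoding. It does not reproduce SeqLike's implementation: the names `AA_ALPHABET`, `to_onehot`, and `from_onehot` are hypothetical, and the 21-letter alphabet is an assumption chosen for illustration.

```python
from typing import List

# Hypothetical alphabet for illustration: the 20 standard amino acids
# plus a gap character. SeqLike's own default alphabets may differ.
AA_ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def to_onehot(seq: str, alphabet: str = AA_ALPHABET) -> List[List[int]]:
    """Encode a sequence as a len(seq) x len(alphabet) matrix of 0s and 1s."""
    index = {letter: i for i, letter in enumerate(alphabet)}
    matrix = []
    for letter in seq:
        if letter not in index:
            raise ValueError(f"{letter!r} is not in the alphabet")
        row = [0] * len(alphabet)
        row[index[letter]] = 1  # exactly one "hot" column per position
        matrix.append(row)
    return matrix


def from_onehot(matrix: List[List[int]], alphabet: str = AA_ALPHABET) -> str:
    """Decode a one-hot matrix by reading off the hot column of each row."""
    return "".join(alphabet[row.index(1)] for row in matrix)


encoded = to_onehot("MSKGEELFTG")
print(len(encoded), len(encoded[0]))  # rows = sequence length, columns = alphabet size
print(from_onehot(encoded))  # round-trips to the input string
```

The alphabet has to be shared between encoder and decoder, which is presumably why the `ntSeqLike` call above passes `alphabet` explicitly when wrapping model output.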

Interconvert between AA and NT forms of a sequence

Back-translation is conveniently built in!

s_nt = ntSeqLike("ATGTCTAAAGGTGAA")
s_nt[0:3] # ATG
s_nt.aa()[0:3] # MSK, nt->aa is well defined
s_nt.aa()[0:3].nt() # ATGTCTAAA, works because SeqLike now has both reps
s_nt[:-1].aa() # TypeError, len(s_nt) not a multiple of 3

s_aa = aaSeqLike("MSKGE")
s_aa.nt() # AttributeError, aa->nt is undefined w/o codon map
s_aa = aaSeqLike(s_aa, codon_map=random_codon_map)
s_aa.nt() # now works, backtranslated to e.g. ATGTCTAAAGGTGAA
s_aa[:1].nt() # ATG, codon_map is maintained
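
The asymmetry shown above (nt->aa is well defined, aa->nt needs a codon map) follows from the genetic code being many-to-one. The sketch below illustrates that with the standard genetic code in plain Python. It is not SeqLike's code: `translate`, `first_codon_map`, and `back_translate` are hypothetical helpers, and this version raises `ValueError` for a bad length where the example above reports a `TypeError`.

```python
from itertools import product
from typing import Dict

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# TCAG order at each position ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt: str) -> str:
    """nt -> aa: every codon maps to exactly one amino acid."""
    if len(nt) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def first_codon_map() -> Dict[str, str]:
    """One arbitrary aa -> codon choice: the first codon listed for each amino acid."""
    codon_map: Dict[str, str] = {}
    for codon, aa in CODON_TABLE.items():
        codon_map.setdefault(aa, codon)
    return codon_map


def back_translate(aa: str, codon_map: Dict[str, str]) -> str:
    """aa -> nt: only defined once a codon has been chosen for each amino acid."""
    return "".join(codon_map[residue] for residue in aa)


protein = translate("ATGTCTAAAGGTGAA")
print(protein)
print(back_translate(protein, first_codon_map()))
```

A different codon map (for example one weighted by organism-specific codon usage) would give a different but equally valid nucleotide sequence for the same protein.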

Easily plot multiple sequence alignments

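from Bio import SeqIO
import pandas as pd
from seqlike import aaSeqLike

# Parse every record in the FASTA file into a list of SeqRecords
seqs = list(SeqIO.parse("file.fasta", "fasta"))
df = pd.DataFrame(
    {
        "names": [s.name for s in seqs],
        "seqs": [aaSeqLike(s) for s in seqs],
    }
)
# The .seq accessor applies SeqLike operations across the whole column
df["aligned"] = df["seqs"].seq.align()
df["aligned"].seq.plot()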

Flexibly build and parse numerical sequence representations

# Assume you have a dataframe with a column of 10 SeqLikes of length 90
df["seqs"].seq.to_onehot().shape # (10, 90, 23), padded if needed
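
The three-dimensional shape above reads as (number of sequences, sequence length, alphabet size). The following sketch shows how padding makes such a rectangular batch possible when lengths differ. It is a pure-Python illustration under stated assumptions: `NT_ALPHABET`, `pad_to_common_length`, `batch_to_onehot`, and `shape` are hypothetical names, and right-padding with a gap character is an assumed strategy that may differ from what SeqLike does.

```python
from typing import List, Sequence, Tuple

# Hypothetical nucleotide alphabet; the last symbol doubles as the pad character.
NT_ALPHABET = "ACGT-"


def pad_to_common_length(seqs: Sequence[str], pad_char: str = "-") -> List[str]:
    """Right-pad every sequence to the length of the longest one."""
    width = max(len(s) for s in seqs)
    return [s.ljust(width, pad_char) for s in seqs]


def batch_to_onehot(
    seqs: Sequence[str], alphabet: str = NT_ALPHABET
) -> List[List[List[int]]]:
    """Return a nested list indexed as [sequence][position][alphabet letter]."""
    index = {letter: i for i, letter in enumerate(alphabet)}
    padded = pad_to_common_length(seqs, pad_char=alphabet[-1])
    return [
        [[1 if index[letter] == col else 0 for col in range(len(alphabet))]
         for letter in s]
        for s in padded
    ]


def shape(batch: List[List[List[int]]]) -> Tuple[int, int, int]:
    """Report (n_sequences, padded_length, alphabet_size)."""
    return len(batch), len(batch[0]), len(batch[0][0])


batch = batch_to_onehot(["ACGT", "AC", "GTA"])
print(shape(batch))
```

Padded positions are encoded as the gap symbol here, so a downstream model can mask them out if needed.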

To see more in action, please check out the docs!

Getting Started

Install the library with pip or conda.

With pip

pip install seqlike

With conda

conda install -c conda-forge seqlike

Authors

Support

Contributors

Thanks go to these wonderful people (emoji key):


Nasos Dousis: 💻
andrew giessel: 💻
Max Wall: 💻 📖
Eric Ma: 💻 📖
Mihir Metkar: 🤔 💻
Marcus Caron: 📖
pagpires: 📖
Sugato Ray: 🚇 🚧
Damien Farrell: 💻

This project follows the all-contributors specification. Contributions of any kind welcome!
