bebatut / enasearch

Licence: MIT License
A Python library for interacting with ENA's API

ENASearch

ENASearch is a Python library for interacting with ENA's API.

Context

The European Nucleotide Archive (ENA) is a database with a comprehensive record of nucleotide sequencing information (raw sequencing data, sequence assembly information and functional annotation). The data contained in ENA can be accessed manually or programmatically via REST URLs. However, building HTTP-based REST requests is not always straightforward, so a user-friendly, high-level interface is needed to make programmatic interaction with ENA easier.

We developed ENASearch, a Python library to search and retrieve data from the ENA database. It also supports rich queries through the fields, filters and functions offered by ENA. ENASearch can be used as a Python package, through a command-line interface or inside Galaxy.

Usage

ENASearch can be used from the command line:

$ enasearch --help
Usage: enasearch [OPTIONS] COMMAND [ARGS]...

  The Python library for interacting with ENA's API

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  get_analysis_fields       Get the fields extractable for an analysis.
  get_display_options       Get the list of possible formats to display...
  get_download_options      Get the options for download of data from...
  get_filter_fields         Get the filter fields of a result to build a...
  get_filter_types          Return the filters usable for the different...
  get_results               Get the possible results (type of data).
  get_returnable_fields     Get the fields extractable for a result.
  get_run_fields            Get the fields extractable for a run.
  get_sortable_fields       Get the fields of a result that can sorted.
  get_taxonomy_results      Get list of taxonomy results.
  retrieve_analysis_report  Retrieve analysis report from ENA.
  retrieve_data             Retrieve ENA data (other than taxon).
  retrieve_run_report       Retrieve run report from ENA.
  retrieve_taxons           Retrieve data from the ENA Taxon Portal.
  search_data               Search data given a query.

$ enasearch search_data --help
Usage: enasearch search_data [OPTIONS]

  Search data given a query.

  This function

  - Extracts the number of possible results for the query - Extracts the all
  the results of the query (by potentially running several times the search
  function)

  The output can be redirected to a file and directly display to the
  standard output given the display chosen.

Options:
  --free_text_search      Use free text search, otherwise the data warehouse
                          is used
  --query TEXT            Query string, made up of filtering conditions,
                          joined by logical ANDs, ORs and NOTs and bound by
                          double quotes; the filter fields for a query are
                          accessible with get_filter_fields and the type of
                          filters with get_filter_types  [required]
  --result TEXT           Id of a result (accessible with get_results)
                          [required]
  --display TEXT          Display option to specify the display format
                          (accessible with get_display_options)  [required]
  --download TEXT         Download option to specify that records are to be
                          saved in a file (used with file option, list
                          accessible with get_download_options)
  --file PATH             File to save the content of the search (used with
                          download option)
  --fields TEXT           Fields to return (accessible with
                          get_returnable_fields, used only for report as
                          display value) [multiple or comma-separated]
  --sortfields TEXT       Fields to sort the results (accessible with
                          get_sortable_fields, used only for report as display
                          value) [multiple or comma-separated]
  --offset INTEGER RANGE  First record to get (used only for display different
                          of fasta and fastq
  --length INTEGER RANGE  Number of records to retrieve (used only for display
                          different of fasta and fastq
  -h, --help              Show this message and exit.
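
The options above map directly onto an argument list, which can be convenient when calling the CLI from a script. The sketch below only assembles the arguments, using the flags shown in the help text. `build_search_command` is a hypothetical helper, not part of ENASearch, and the query, result, display and field values are placeholders that should be checked against get_filter_fields, get_results, get_display_options and get_returnable_fields.

```python
import shlex


def build_search_command(query, result, display, fields=None, offset=None, length=None):
    """Build the argument list for `enasearch search_data` without running it.

    Hypothetical helper for illustration; only flags from the help text are used.
    """
    args = [
        "enasearch", "search_data",
        "--query", query,      # required
        "--result", result,    # required
        "--display", display,  # required
    ]
    if fields:
        # The help text accepts comma-separated field names.
        args += ["--fields", ",".join(fields)]
    if offset is not None:
        args += ["--offset", str(offset)]
    if length is not None:
        args += ["--length", str(length)]
    return args


# Placeholder values, assumed for illustration only.
command = build_search_command(
    query="tax_eq(10090)",
    result="assembly",
    display="report",
    fields=["accession", "assembly_name"],
    length=10,
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
```

Passing the list itself to `subprocess.run(command, check=True)` would avoid shell quoting issues altogether.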

It can also be used as a Python library:

>>> import enasearch
>>> enasearch.retrieve_data(
...     ids="A00145",
...     display="fasta",
...     download=None,
...     file=None,
...     offset=0,
...     length=100000,
...     subseq_range="3-63",
...     expanded=None,
...     header=None)
[SeqRecord(seq=Seq('GAAGGAAGGTCTTCAGAGAACCTAGAGAGCAGGTTCACAGAGTCACCCACCTCA...GCC', SingleLetterAlphabet()), id='ENA|A00145|A00145.1', name='ENA|A00145|A00145.1', description='ENA|A00145|A00145.1 B.taurus BoIFN-alpha A mRNA : Location:3..63', dbxrefs=[])]
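
The offset and length arguments select a window of records, so a large result set can be fetched in successive windows. As a minimal sketch of that bookkeeping (the helper name `iter_windows` is hypothetical, the total count is assumed to be known beforehand, and nothing here contacts ENA):

```python
def iter_windows(total, length, start=0):
    """Yield (offset, length) pairs that cover records start..total-1.

    Hypothetical helper; `total` is assumed to be known, for example from a
    previous count request.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    offset = start
    while offset < total:
        # The last window may be shorter than `length`.
        yield offset, min(length, total - offset)
        offset += length


windows = list(iter_windows(total=250, length=100))
print(windows)  # [(0, 100), (100, 100), (200, 50)]
```

Each pair could then be passed as the offset and length arguments of a call such as the one above.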

The information extracted from ENA can be returned in several formats, including HTML, text, XML, FASTA and FASTQ. XML output is converted into a Python dictionary using xmltodict, and FASTA and FASTQ output into SeqRecord objects using BioPython.

Installation

ENASearch can be installed with pip:

$ pip install enasearch

or with conda:

$ conda install -c bioconda enasearch

Tests

ENASearch comes with tests:

$ make test

These tests are run automatically on Travis CI for each pull request.

Documentation

Documentation about ENASearch is available online at http://bebatut.fr/enasearch

To update it:

  1. Make the changes in src/docs
  2. Generate the doc with
$ make doc
  3. Check it by opening the docs/index.html file in a web browser
  4. Propose the changes via a Pull Request

Generate the data descriptions

To run, ENASearch needs descriptive data from ENA that specifies how ENA can be queried. Currently, this information is extracted manually into CSV files in the data directory. Python objects are generated from these CSV files with:

$ python src/serialize_ena_data_descriptors.py
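
As a rough illustration of this kind of step (not the actual contents of serialize_ena_data_descriptors.py), the sketch below reads a CSV description into a dictionary and serializes it. The column names, the in-memory CSV text, the function name and the use of pickle are all assumptions made for the example.

```python
import csv
import io
import pickle

# Assumed CSV layout; the real files in the data directory may differ.
CSV_TEXT = """id,description
assembly,Genome assemblies
read_run,Raw read runs
"""


def load_descriptors(handle, key="id"):
    """Read CSV rows into a dict keyed by the `key` column (hypothetical layout)."""
    reader = csv.DictReader(handle)
    return {row[key]: {k: v for k, v in row.items() if k != key} for row in reader}


descriptors = load_descriptors(io.StringIO(CSV_TEXT))

# Serialize to bytes and load back to check the round trip.
blob = pickle.dumps(descriptors)
restored = pickle.loads(blob)
print(restored["assembly"]["description"])
```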