Ensembl / Ensembl Hive

Licence: apache-2.0
EnsEMBL Hive - a system for creating and running pipelines on a distributed compute resource

Programming languages: python, java, perl

eHive

eHive is a system for running computation pipelines on distributed computing resources - clusters, farms or grids.

The name comes from the way pipelines are processed by a swarm of autonomous agents.

Available documentation

The main entry point is the online user manual, which can also be downloaded for offline access.

eHive in a nutshell

Blackboard, Jobs and Workers

At the centre of each running pipeline is a database that acts as a blackboard listing the individual tasks to be run. These tasks (we call them Jobs) are claimed and processed by "Worker bees", or just Workers: autonomous processes that run continuously on the compute farm and connect to the pipeline database to report the progress of their Jobs or to claim more. When a Worker discovers that its predefined time is up or that there are no more Jobs to do, it stops claiming Jobs and exits the compute farm, freeing its resources.
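The blackboard idea can be illustrated with a toy model. This is only a sketch and not eHive's actual schema or code: the table layout, the status names and the functions `make_blackboard`, `claim_job` and `run_worker` are all hypothetical, and the Worker's "predefined time" is replaced by a simple job budget.

```python
import sqlite3


def make_blackboard(n_jobs):
    """Create an in-memory 'blackboard' database holding n_jobs READY jobs."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE job (job_id INTEGER PRIMARY KEY, status TEXT, worker TEXT)"
    )
    db.executemany("INSERT INTO job (status) VALUES (?)", [("READY",)] * n_jobs)
    db.commit()
    return db


def claim_job(db, worker):
    """Claim one READY job for `worker`; return its id, or None if none are left."""
    row = db.execute(
        "SELECT job_id FROM job WHERE status = 'READY' ORDER BY job_id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute(
        "UPDATE job SET status = 'CLAIMED', worker = ? WHERE job_id = ?",
        (worker, row[0]),
    )
    db.commit()
    return row[0]


def run_worker(db, worker, max_jobs):
    """Process jobs until the budget is spent or the blackboard has no READY jobs."""
    done = 0
    while done < max_jobs:
        job_id = claim_job(db, worker)
        if job_id is None:
            break  # nothing left to do: the worker exits and frees its slot
        # ... the real work for the job would happen here ...
        db.execute("UPDATE job SET status = 'DONE' WHERE job_id = ?", (job_id,))
        db.commit()
        done += 1
    return done
```

With five Jobs on the blackboard, a Worker with a budget of three processes three Jobs and exits; a second Worker then picks up the remaining two and exits because nothing is left.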

Beekeeper

A separate Beekeeper process makes sure there are always enough Workers on the farm. It regularly checks the state of both the blackboard and the farm and submits more Workers when needed. There is no direct communication between the Beekeeper and the Workers, which makes the system fairly fault-tolerant: if any one agent crashes, for whatever reason, the rest of the system keeps running.
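As a rough illustration of the decision the Beekeeper makes on each check, here is a minimal sketch. The rule used (one Worker per ready Job, capped by a total limit) and the function name `workers_to_submit` are assumptions made for this example; the real scheduling logic takes more factors into account.

```python
def workers_to_submit(ready_jobs, running_workers, max_workers):
    """Return how many extra Workers to submit in one Beekeeper iteration.

    Toy rule (an assumption, not eHive's algorithm): aim for one Worker
    per ready Job, never exceeding max_workers Workers in total.
    """
    wanted = min(ready_jobs, max_workers)
    return max(0, wanted - running_workers)
```

Because the function only looks at counts read from the blackboard and the farm, it needs no channel to the Workers themselves, which mirrors the lack of direct communication described above.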

Analyses

Jobs that share the same code, common parameters and resource requirements are typically grouped into Analyses. An Analysis can generally be viewed as a "base class" for the Jobs that belong to it, but it also acts as a "container" for them.

An Analysis is implemented as a Runnable: a Perl, Python or Java module conforming to a special interface. eHive provides some basic Runnables, notably one that runs arbitrary commands (programs and scripts written in other languages).
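To give a feel for what "conforming to a special interface" means, the sketch below mimics a Runnable that runs an arbitrary command. It is self-contained and does not use eHive's own base class: `ToyRunnable`, `ToyCommand` and `life_cycle` are hypothetical names, and the three stages (`fetch_input`, `run`, `write_output`) follow the life cycle described in the user manual, which should be consulted for the real interface.

```python
import subprocess
import sys


class ToyRunnable:
    """Stand-in base class (hypothetical, not eHive's): holds job parameters."""

    def __init__(self, params):
        self.params = dict(params)

    def fetch_input(self):
        pass

    def run(self):
        pass

    def write_output(self):
        pass

    def life_cycle(self):
        """Run the three stages in order, as a Worker would for each Job."""
        self.fetch_input()
        self.run()
        self.write_output()


class ToyCommand(ToyRunnable):
    """Runs an arbitrary command given in the 'cmd' parameter (a list of strings)."""

    def fetch_input(self):
        self.cmd = self.params["cmd"]

    def run(self):
        result = subprocess.run(self.cmd, capture_output=True, text=True)
        self.return_code = result.returncode
        self.stdout = result.stdout

    def write_output(self):
        # A non-zero exit status is treated as a failed Job in this toy model.
        if self.return_code != 0:
            raise RuntimeError(f"command failed with status {self.return_code}")


if __name__ == "__main__":
    job = ToyCommand({"cmd": [sys.executable, "-c", "print('hello from a job')"]})
    job.life_cycle()
    print(job.stdout.strip())
```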

PipeConfig file defines Analyses and dependency rules of the pipeline

eHive pipeline databases are built according to PipeConfig files, which are Perl modules conforming to a special interface. A PipeConfig file defines the structure of the pipeline, which is a graph whose nodes are Analyses (with their code, parameters and resource requirements) and whose edges are various dependency rules:

  • Dataflow rules define how data that flows out of an Analysis can be used to trigger creation of Jobs in other Analyses
  • Control rules define dependencies between Analyses as Jobs' containers ("Jobs of Analysis Y can only start when all Jobs of Analysis X are done")
  • Semaphore rules define dependencies between individual Jobs on a more fine-grained level
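The first two kinds of rule above can be pictured with a small in-memory model. Real PipeConfig files are Perl modules; this Python sketch only illustrates the graph semantics, and the `Analysis` class with its `flows_to`, `wait_for` and `jobs` fields is hypothetical. Semaphore rules are not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class Analysis:
    name: str
    flows_to: list = field(default_factory=list)  # dataflow rule targets
    wait_for: list = field(default_factory=list)  # control rule: blocking Analyses
    jobs: list = field(default_factory=list)      # job statuses: "READY" or "DONE"


def is_blocked(analysis, pipeline):
    """Control rule: blocked while any Analysis it waits for has unfinished Jobs."""
    return any(
        status != "DONE"
        for blocker in analysis.wait_for
        for status in pipeline[blocker].jobs
    )


def finish_job(analysis, index, pipeline):
    """Mark a Job DONE and apply dataflow rules: seed one Job per target Analysis."""
    analysis.jobs[index] = "DONE"
    for target in analysis.flows_to:
        pipeline[target].jobs.append("READY")
```

For example, if a hypothetical "split" Analysis flows into "work" and "merge" waits for "work", finishing a "split" Job creates a "work" Job, and "merge" stays blocked until that Job is done.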

There are also other parameters of Analyses that control, for example:

  • how many Workers can simultaneously work on a given Analysis,
  • how many times a Job should be tried until it is considered failed,
  • what should be automatically done with a Job if it needs more memory/time, etc.
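The retry parameter in the list above can be sketched as follows. This is a simplified model with a hypothetical function name and a hypothetical `max_tries` argument; how eHive counts attempts and which errors trigger a retry is described in the user manual.

```python
def run_with_retries(task, max_tries):
    """Call `task` until it succeeds or has been tried max_tries times.

    Returns a (status, attempts) pair, where status is "DONE" or "FAILED".
    """
    attempts = 0
    while attempts < max_tries:
        attempts += 1
        try:
            task()
            return "DONE", attempts
        except Exception:
            continue  # a real system would record the error before retrying
    return "FAILED", attempts
```

A task that fails twice and then succeeds ends up DONE when three tries are allowed, and FAILED when only two are.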

Grid scheduler and Meadows

eHive has a generic interface named Meadow that describes how to interact with an underlying grid scheduler (submit jobs, query a job's status, etc.). eHive is compatible with IBM Platform LSF, Sun Grid Engine (now known as Oracle Grid Engine), HTCondor, PBS Pro, Docker Swarm and possibly others. Read more about this in the user manual.
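The idea of a scheduler-independent interface can be sketched as an abstract class with one concrete backend. The method names `submit_workers` and `status_of`, the `LocalMeadow` class and the `NOOP_COMMAND` placeholder are assumptions made for this example and do not reproduce eHive's Meadow API; the backend here simply starts local processes instead of talking to a grid scheduler.

```python
import subprocess
import sys
from abc import ABC, abstractmethod

# Stand-in for a real Worker command: a Python process that exits immediately.
NOOP_COMMAND = [sys.executable, "-c", "pass"]


class Meadow(ABC):
    """Hypothetical minimal scheduler interface: submit Workers, query status."""

    @abstractmethod
    def submit_workers(self, command, count):
        """Start `count` Workers running `command`; return their handles."""

    @abstractmethod
    def status_of(self, handle):
        """Return "RUNNING" or "FINISHED" for a previously returned handle."""


class LocalMeadow(Meadow):
    """Toy backend that runs Workers as subprocesses on the local machine."""

    def __init__(self):
        self._procs = {}

    def submit_workers(self, command, count):
        handles = []
        for _ in range(count):
            proc = subprocess.Popen(command)
            self._procs[proc.pid] = proc
            handles.append(proc.pid)
        return handles

    def status_of(self, handle):
        # poll() returns None while the process is still running.
        return "RUNNING" if self._procs[handle].poll() is None else "FINISHED"
```

Code that only calls `submit_workers` and `status_of` does not need to know which scheduler sits underneath, which is the point of having one interface with a backend per scheduler.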

Docker image

We have a Docker image available on the Docker Hub. It can be used to showcase eHive scripts (init_pipeline.pl, beekeeper.pl, runWorker.pl) in a container.

Open a session in a new container (will run bash)

docker run -it ensemblorg/ensembl-hive

Initialize and run a pipeline

docker run -it ensemblorg/ensembl-hive init_pipeline.pl Bio::EnsEMBL::Hive::Examples::LongMult::PipeConfig::LongMult_conf -pipeline_url $URL
docker run -it ensemblorg/ensembl-hive beekeeper.pl -url $URL -loop -sleep 0.2
docker run -it ensemblorg/ensembl-hive runWorker.pl -url $URL

Docker Swarm

Once packaged into Docker images, a pipeline can actually be run under the Docker Swarm orchestrator, and thus on any cloud infrastructure that supports it (e.g. Amazon Web Services, Microsoft Azure).

Read more about this in the user manual.

Contact us (mailing list)

eHive was originally conceived and used within the EnsEMBL Compara group for running Comparative Genomics pipelines. It has since been split out into a standalone software tool and is used in many projects, both on the Genome Campus, Cambridge and outside. There is an eHive users' mailing list for questions, suggestions, discussions and announcements.

To subscribe, please visit http://listserver.ebi.ac.uk/mailman/listinfo/ehive-users
