
nmichlo / ruck

Licence: MIT
🧬 Modularised Evolutionary Algorithms For Python with Optional JIT and Multiprocessing (Ray) support. Inspired by PyTorch Lightning


Projects that are alternatives of or similar to ruck

NSGAII.jl
A NSGA-II implementation in Julia
Stars: ✭ 18 (-64%)
Mutual labels:  nsga-ii, multiobjective-optimization
moead-py
A Python implementation of the decomposition based multi-objective evolutionary algorithm (MOEA/D)
Stars: ✭ 56 (+12%)
Mutual labels:  nsga-ii, multiobjective-optimization
Genetic-Algorithm-for-Job-Shop-Scheduling-and-NSGA-II
Learning how to implement GA and NSGA-II for job shop scheduling problem in python
Stars: ✭ 178 (+256%)
Mutual labels:  nsga-ii, multiobjective-optimization
geneticalgorithm2
Supported highly optimized and flexible genetic algorithm package for python
Stars: ✭ 36 (-28%)
Mutual labels:  evolutionary-algorithms, genetic-algorithms
Optimized-MDVRP
"Using Genetic Algorithms for Multi-depot Vehicle Routing" paper implementation.
Stars: ✭ 30 (-40%)
Mutual labels:  evolutionary-algorithms, genetic-algorithms
vita
Vita - Genetic Programming Framework
Stars: ✭ 24 (-52%)
Mutual labels:  evolutionary-algorithms, genetic-algorithms
Python Concurrency
Code examples from my toptal engineering blog article
Stars: ✭ 131 (+162%)
Mutual labels:  multiprocessing
Vermin
Concurrently detect the minimum Python versions needed to run code
Stars: ✭ 218 (+336%)
Mutual labels:  multiprocessing
Pspider
A simple and easy-to-use Python crawler framework. QQ discussion group: 597510560
Stars: ✭ 1,611 (+3122%)
Mutual labels:  multiprocessing
Zproc
Process on steroids
Stars: ✭ 112 (+124%)
Mutual labels:  multiprocessing
alphafold2-multiprocessing
Use AlphaFold by Deep Mind in Batch Mode + Multiprocessing
Stars: ✭ 22 (-56%)
Mutual labels:  multiprocessing
warp-drive
Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning Framework on a GPU (JMLR 2022)
Stars: ✭ 364 (+628%)
Mutual labels:  numba
Fooproxy
A robust and efficient scoring-based, targeted IP proxy pool and API service. Custom collectors can be plugged in to crawl proxy IPs, generating databases of valid proxies tailored to one or more of your crawler's target websites. Supports MongoDB 4.0; uses Python 3.7. (Scored IP proxy pool; customised proxy data crawlers can be added anytime.)
Stars: ✭ 195 (+290%)
Mutual labels:  multiprocessing
Lowpolify
Create low-poly art from any image 🌟🌟
Stars: ✭ 149 (+198%)
Mutual labels:  multiprocessing
Multirunner
This is a python package for multi-process running.
Stars: ✭ 242 (+384%)
Mutual labels:  multiprocessing
Axeman
Axeman is a utility to retrieve certificates from Certificate Transparency Lists (CTLs)
Stars: ✭ 125 (+150%)
Mutual labels:  multiprocessing
datafsm
Machine Learning Finite State Machine Models from Data with Genetic Algorithms
Stars: ✭ 14 (-72%)
Mutual labels:  evolutionary-algorithms
Process
An async process dispatcher for Amp.
Stars: ✭ 119 (+138%)
Mutual labels:  multiprocessing
React Native Multithreading
🧵 Fast and easy multithreading for React Native using JSI
Stars: ✭ 164 (+228%)
Mutual labels:  multiprocessing
Pulsar
Event driven concurrent framework for Python
Stars: ✭ 1,867 (+3634%)
Mutual labels:  multiprocessing

🧬 Ruck 🏉

Performant evolutionary algorithms for Python


Visit the examples to get started, or browse the releases.

Contributions are welcome!


Goals

Ruck aims to meet the following criteria:

  1. Provide high quality, readable implementations of algorithms.
  2. Be easily extensible and debuggable.
  3. Be performant while maintaining simplicity.

Features

Ruck has various features that will be expanded upon in time:

  • 📦   Modular evolutionary systems inspired by PyTorch Lightning
    • Helps organise code & arguably looks clean
  • 🎯   Multi-Objective optimisation support
    • Optionally optimised version of NSGA-II if numba is installed, over 65x faster than the DEAP equivalent
  • 🏎   Optional multithreading support with ray, including helper functions
  • 🏭   Factory methods for simple evolutionary algorithms
  • 🧪   Various helper functions for selection, mutation and mating (see the sketch below)
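
As a taste of these helpers, the functions in ruck.functional (aliased ruck.R) operate on raw values, so they can be composed or partially applied directly. The following is a minimal sketch using the helpers that appear in the examples later in this README; it assumes, as with their DEAP counterparts, that the mate helper takes two parent values and returns two child values.

import functools
import numpy as np
from ruck import R

# two random boolean parent values
a = np.random.random(10) < 0.5
b = np.random.random(10) < 0.5

# mate: one-dimensional crossover between the two parents
# (assumed to return two children, as in DEAP)
child_a, child_b = R.mate_crossover_1d(a, b)

# mutate: flip groups of bits with probability p
mutate_fn = functools.partial(R.mutate_flip_bit_groups, p=0.05)
mutated = mutate_fn(child_a)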

Citing Ruck

Please use the following citation if you use Ruck in your research:

@Misc{Michlo2021Ruck,
  author =       {Nathan Juraj Michlo},
  title =        {Ruck - Performant evolutionary algorithms for Python},
  howpublished = {Github},
  year =         {2021},
  url =          {https://github.com/nmichlo/ruck}
}

Overview

Ruck takes inspiration from PyTorch Lightning's module system. The population creation, offspring generation, evaluation and selection steps are all contained within a single module inheriting from EaModule, while the training logic and components are separated into their own class.

Members of a Population (a list of Members) are intended to be read-only. Modifications should not be made to members; instead, new members should be created with the modified values. This enables us to easily implement efficient multi-threading (see below).
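
Concretely, the copy-don't-mutate pattern looks like the following sketch, which uses only the Member API shown in the examples below:

import numpy as np
from ruck import Member

old_member = Member(np.random.random(100) < 0.5)

# WRONG: modifying the stored value in place breaks the read-only contract
# old_member.value[:5] = True

# RIGHT: derive a new value and wrap it in a new Member
new_value = old_member.value ^ (np.random.random(old_member.value.shape) < 0.05)
new_member = Member(new_value)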

The trainer automatically constructs HallOfFame and LogBook objects which keep track of your population and offspring. EaModule provides defaults for get_stats_groups and get_progress_stats that can be overridden if you wish to customise the tracked statistics and the statistics displayed by tqdm.
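
Such an override might look like the following sketch. Note that this is hypothetical: the stat keys ('evals', 'fit:max') are assumptions about Ruck's internal naming rather than documented API, so check the defaults in the source before relying on them.

from ruck import EaModule

class MyModule(EaModule):
    # ... the usual required methods, as in the examples below ...

    # hypothetical override: the stat keys here are assumed, not documented API
    def get_progress_stats(self):
        return ('evals', 'fit:max')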

Minimal OneMax Example

import random
import numpy as np
from ruck import *


class OneMaxMinimalModule(EaModule):
    """
    Minimal onemax example
    - The goal is to flip all the bits of a boolean array to True
    - Offspring are generated as bit flipped versions of the previous population
    - A selection tournament is performed between the previous population and the offspring
    """

    # evaluate unevaluated members according to their values
    def evaluate_values(self, values):
        return [v.sum() for v in values]

    # generate 300 random members of size 100, with each bit set to True with 50% probability
    def gen_starting_values(self):
        return [np.random.random(100) < 0.5 for _ in range(300)]

    # randomly flip 5% of the bits of each member in the population
    # the previous population members should never be modified
    def generate_offspring(self, population):
        return [Member(m.value ^ (np.random.random(m.value.shape) < 0.05)) for m in population]

    # selection tournament between population and offspring
    def select_population(self, population, offspring):
        combined = population + offspring
        return [max(random.sample(combined, k=3), key=lambda m: m.fitness) for _ in range(len(population))]


if __name__ == '__main__':
    # create and train the population
    module = OneMaxMinimalModule()
    pop, logbook, halloffame = Trainer(generations=100, progress=True).fit(module)

    print('initial stats:', logbook[0])
    print('final stats:', logbook[-1])
    print('best member:', halloffame.members[0])

Advanced OneMax Example

Ruck provides various helper functions and implementations of evolutionary algorithms for convenience. The following example makes use of these additional features so that components and behaviour can easily be swapped out.

The three basic evolutionary algorithms provided are based on DEAP's default algorithms from deap.algorithms. They can be created with ruck.functional.make_ea; we provide the alias ruck.R for ruck.functional. R.make_ea supports the following modes: simple, mu_plus_lambda and mu_comma_lambda.

Code Example

"""
OneMax serial example based on:
https://github.com/DEAP/deap/blob/master/examples/ga/onemax_numpy.py
"""

import functools
import numpy as np
from ruck import *


class OneMaxModule(EaModule):

    def __init__(
        self,
        population_size: int = 300,
        offspring_num: int = None,  # offspring_num (lambda) is automatically set to population_size (mu) when `None`
        member_size: int = 100,
        p_mate: float = 0.5,
        p_mutate: float = 0.5,
        ea_mode: str = 'simple'
    ):
        # save the arguments to the .hparams property. values are taken from the
        # local scope so modifications can be captured if the call to this is delayed.
        self.save_hyperparameters()
        # implement the required functions for `EaModule`
        self.generate_offspring, self.select_population = R.make_ea(
            mode=self.hparams.ea_mode,
            offspring_num=self.hparams.offspring_num,
            mate_fn=R.mate_crossover_1d,
            mutate_fn=functools.partial(R.mutate_flip_bit_groups, p=0.05),
            select_fn=functools.partial(R.select_tournament, k=3),
            p_mate=self.hparams.p_mate,
            p_mutate=self.hparams.p_mutate,
        )

    def evaluate_values(self, values):
        return map(np.sum, values)

    def gen_starting_values(self) -> Population:
        return [
            np.random.random(self.hparams.member_size) < 0.5
            for i in range(self.hparams.population_size)
        ]


if __name__ == '__main__':
    # create and train the population
    module = OneMaxModule(population_size=300, member_size=100)
    pop, logbook, halloffame = Trainer(generations=40, progress=True).fit(module)

    print('initial stats:', logbook[0])
    print('final stats:', logbook[-1])
    print('best member:', halloffame.members[0])

Multithreading OneMax Example (Ray)

If we need to scale up the computational requirements, for example to larger member and population sizes, the above serial implementations will quickly run into performance problems.

The basic Ruck implementations of the various evolutionary algorithms are designed around a map function that can be swapped out to add multiprocessing support. We can do this easily using ray, and we even provide various helper functions that enhance ray support.

  1. We begin by placing members' values into shared memory using ray's read-only object store and the ray.put function. The resulting ObjectRefs point to the original np.ndarray values; when retrieved with ray.get, the original arrays are recovered using an efficient zero-copy procedure (see the short sketch after this list). This is advantageous over something like Python's multiprocessing module, which uses expensive pickle operations to pass data around.

  2. The second step is to swap out the aforementioned map function from the previous example for a multiprocessing equivalent. We use ray.remote along with ray.get, and provide the ray_map function, which has the same API as python's map but accepts ray.remote(my_fn).remote values instead.

  3. Finally we need to update our mate and mutate functions to handle ObjectRefs. We provide a convenient wrapper that automatically calls ray.put on function results so that you can re-use your existing code.
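
The following is a small standalone sketch of steps 1 and 2 (the array contents are illustrative only):

import numpy as np
import ray
from ruck.external.ray import ray_map

ray.init()

# step 1: place a large value into ray's shared, read-only object store
value = np.random.random(1_000_000) < 0.5
ref = ray.put(value)

# retrieval is an efficient zero-copy operation for numpy arrays
assert np.array_equal(ray.get(ref), value)

# step 2: ray_map behaves like python's map, but takes a `.remote` function
# and resolves ObjectRef arguments automatically
totals = ray_map(ray.remote(np.sum).remote, [ref, ref])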

Code Example

"""
OneMax parallel example using ray's object store.

1 byte (bool) * 1_000_000 * 128 members ~= 128 MB of memory to store this population.
This is quite a bit of processing that needs to happen! But using ray
and its object store we can do this efficiently!
"""

from functools import partial
import numpy as np
from ruck import *
from ruck.external.ray import *


class OneMaxRayModule(EaModule):

    def __init__(
        self,
        population_size: int = 300,
        offspring_num: int = None,  # offspring_num (lambda) is automatically set to population_size (mu) when `None`
        member_size: int = 100,
        p_mate: float = 0.5,
        p_mutate: float = 0.5,
        ea_mode: str = 'mu_plus_lambda'
    ):
        self.save_hyperparameters()
        # implement the required functions for `EaModule`
        self.generate_offspring, self.select_population = R.make_ea(
            mode=self.hparams.ea_mode,
            offspring_num=self.hparams.offspring_num,
            # decorate the functions with `ray_remote_puts`/`ray_remote_put`, which
            # automatically `ray.get` ObjectRef arguments and `ray.put` the returned results
            mate_fn=ray_remote_puts(R.mate_crossover_1d).remote,
            mutate_fn=ray_remote_put(R.mutate_flip_bit_groups).remote,
            # efficient to compute locally
            select_fn=partial(R.select_tournament, k=3),
            p_mate=self.hparams.p_mate,
            p_mutate=self.hparams.p_mutate,
            # ENABLE multiprocessing
            map_fn=ray_map,
        )
        # the eval function is cached on the instance to avoid repeated calls
        # to ray.remote. We use ray.remote instead of ray_remote_put above
        # because we want the actual fitness values to be returned,
        # not ObjectRefs to those values.
        self._ray_eval = ray.remote(np.mean).remote

    def evaluate_values(self, values):
        # values is a list of `ray.ObjectRef`s, not `np.ndarray`s.
        # ray_map applies the remote eval function to each value;
        # ray automatically `ray.get`s `ObjectRef`s passed as arguments
        return ray_map(self._ray_eval, values)

    def gen_starting_values(self):
        # generate objects and place in ray's object store
        return [
            ray.put(np.random.random(self.hparams.member_size) < 0.5)
            for i in range(self.hparams.population_size)
        ]


if __name__ == '__main__':
    # initialise ray, using all available system resources by default
    ray.init()

    # create and train the population
    module = OneMaxRayModule(population_size=128, member_size=1_000_000)
    pop, logbook, halloffame = Trainer(generations=200, progress=True).fit(module)

    print('initial stats:', logbook[0])
    print('final stats:', logbook[-1])
    print('best member:', halloffame.members[0])
