All Projects → berhane → LAP-solvers

berhane / LAP-solvers

Licence: other
Benchmarking Linear Assignment Problem Solvers

Programming Languages

Jupyter Notebook
11667 projects
python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to LAP-solvers

Hungarian.jl
The Hungarian(Kuhn-Munkres) algorithm for Julia
Stars: ✭ 32 (-53.62%)
Mutual labels:  hungarian-algorithm, munkres
elm-benchmark
Benchmarking for Elm
Stars: ✭ 48 (-30.43%)
Mutual labels:  benchmarking
forest-benchmarking
A library for quantum characterization, verification, validation (QCVV), and benchmarking using pyQuil.
Stars: ✭ 41 (-40.58%)
Mutual labels:  benchmarking
benchmark-trend
Measure performance trends of Ruby code
Stars: ✭ 60 (-13.04%)
Mutual labels:  benchmarking
mzbench
Distributed Benchmarking
Stars: ✭ 39 (-43.48%)
Mutual labels:  benchmarking
reframe
A powerful Python framework for writing and running portable regression tests and benchmarks for HPC systems.
Stars: ✭ 154 (+123.19%)
Mutual labels:  benchmarking
PiBenchmarks
Raspberry Pi benchmarking scripts featuring a storage benchmark with score
Stars: ✭ 69 (+0%)
Mutual labels:  benchmarking
prettyBenching
🦕 A small lib to make your Deno benchmarking progress and results look pretty
Stars: ✭ 23 (-66.67%)
Mutual labels:  benchmarking
immuneML
immuneML is a platform for machine learning analysis of adaptive immune receptor repertoire data.
Stars: ✭ 41 (-40.58%)
Mutual labels:  benchmarking
perf check
PERRRFFF CHERRRRK!
Stars: ✭ 16 (-76.81%)
Mutual labels:  benchmarking
beapi-bench
Tool for benchmarking apis. Uses ApacheBench(ab) to generate data and gnuplot for graphing. Adding new features almost daily
Stars: ✭ 16 (-76.81%)
Mutual labels:  benchmarking
graphsim
R package: Simulate Expression data from igraph network using mvtnorm (CRAN; JOSS)
Stars: ✭ 16 (-76.81%)
Mutual labels:  benchmarking
grandma
👵 fully programmable stress testing framework
Stars: ✭ 20 (-71.01%)
Mutual labels:  benchmarking
ezab
A suite of tools for benchmarking (load testing) web servers and databases
Stars: ✭ 16 (-76.81%)
Mutual labels:  benchmarking
bench
⏱️ Reliable performance measurement for Go programs. All in one design.
Stars: ✭ 33 (-52.17%)
Mutual labels:  benchmarking
EDTA
Extensive de-novo TE Annotator
Stars: ✭ 210 (+204.35%)
Mutual labels:  benchmarking
ake-datasets
Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
Stars: ✭ 125 (+81.16%)
Mutual labels:  benchmarking
load-testing-toolkit
Collection of open-source tools for debugging, benchmarking, load and stress testing your code or services.
Stars: ✭ 65 (-5.8%)
Mutual labels:  benchmarking
AMBER
AMBER: Assessment of Metagenome BinnERs
Stars: ✭ 18 (-73.91%)
Mutual labels:  benchmarking
benchee html
Draw pretty micro benchmarking charts in HTML and allow to export them as png for benchee
Stars: ✭ 50 (-27.54%)
Mutual labels:  benchmarking

Purpose

The script benchmarks the performance of Python3 linear assignment problem solvers for random cost matrices of different sizes. These solvers are:

They all formally have O(n3) complexity, but their performance differs substantially based on their implementation and the size of the matrix they are trying to solve. The solvers can be classified based on some unique characteristics.

Module Python or C/C++/Cython Algorithm
scipy.optimize.linear_sum_assignment Python(<v1.4)/C++(=>v1.4)) Hungarian
munkres.Munkres Python Hungarian
laptools.clap Python ?
hungarian.lap C++ Hungarian
lap.lapjv C++ Jonker-Volgenant
lapjv.lapjv C++ Jonker-Volgenant
lapsolver.solve_dense C++ shortest augmenting path

The purpose of this benchmarking exercise is to see which implementation performs best for a given matrix size. My interest is to use this information to improve the performance of Arbalign and expand its use.

Contents

The repo contains the following:

  • benchmark-lap-solvers.py - a Python3 script comparing four/six implementations
  • benchmark-lap-solvers-py3.ipynb - a Jupyter notebook comparing four/six implementations. It has been tested using Python 3.6 and 3.7.

Usage

It's simple once you have installed the necessary packages.

Usage: benchmark-lap-solvers.py [-h] [-c] [-v] [-np] [-sp] [--min [min]]
                                    [--max [max]] [--ncyc [ncyc]]

    Benchmarks the performance of linear assignment problem solvers for
    random cost matrices of different dimensions.


optional arguments:
  -h, --help       show this help message and exit
  -c, --printcost  Print the minimum cost. The default is false, i.e. will not
                   print the minimum cost
  -v, --verbose    Determines verbosity. The default is minimal printing, i.e.
                   not verbose
  -np, --noplot    Do not plot data using matplotlib. The default is to save
                   plot of the data in PNG format, but not open the plot GUI
  -sp, --showplot  Show plot of data using matplotlib. The default is to save
                   plot of the data in PNG format, but not open the plot GUI
  --min [min]      minimum dimension of cost matrix to solve. The default is 8
                   (2^3 x 2^3)
  --max [max]      maximum dimension of cost matrix to solve. The default is
                   4096 (2^12 x 2^12)
  --ncyc [ncyc]    number of times to solve cost matrices and average their
                   timing. The default is 3 cycles

    The script  will produce the following:
    1) data of timing for LAP solving random cost matrices of
    dimensions 2^{min} - 2^{max}
    2) plot of timing for LAP solving random cost matrices of
    dimensions 2^{min} - 2^{max}

Examples

command execution note
python3 ./benchmark-lap-solvers.py python3 ./benchmark-lap-solvers.py --ncyc 3 --min 8 --max 4096 default
python3 ./benchmark-lap-solvers.py --min 2 --max 512 python3 ./benchmark-lap-solvers.py --ncyc 3 --min 2 --max 512 default, except it looks at small matrices only
python3 ./benchmark-lap-solvers.py -np python3 ./benchmark-lap-solvers.py --ncyc 3 --min 8 --max 4096 -np default, except plotting is suppressed
python3 ./benchmark-lap-solvers.py --printcost python3 ./benchmark-lap-solvers.py --ncyc 3 --min 8 --max 4096 --printcost default, except it prints lowest cost for each method

If you want to add other solvers to the list, it should be easy to figure out what parts to update in the scripts.

Requirements

  • numpy module. If you don't have it already, you can install it using pip3 or conda. Most of the packages should be available in the default Conda channels/repos, but you may have to search a little harder for others.
    • pip3 install numpy
    • conda install numpy
  • matplotlib module
    • pip3 install matplotlib
    • conda install matplotlib
  • scipy module (install version 1.4+ with Python 3.5+ for a fast C++ implementation)
    • pip3 install scipy==1.4
    • conda install scipy
  • munkres module by Brian Clapper.
    • pip3 install munkres
    • conda install munkres
  • hungarian module by Harold Cooper (does not work with Python 3.5+)
    • pip3 install hungarian
    • conda install -c psi4 hungarian
  • lap module by Tomas Kozmar.
    • pip3 install lap
    • conda install lap
  • lapjv module by src{d} for Python3
    • pip3 install lapjv
  • lapsolver module by Christoph Heindl
    • pip3 install lapsolver
    • conda install -c loopbio lapsolver
  • laptoools module by jdomoorman
    • pip3 install laptools (Python 3.5+)

Output

The script will produce output similar to what's shown below. Some things to note are:

  • The timings here corresponds to an average of three Python 3.5.6 runs on CentOS 7 machine with 2.4 GHz Intel Xeon Gold 6148 processor and 192GB of RAM
  • The random matrices are filled with floating point numbers ranging from 0 to the size (# of rows or columns) of the matrix. They are generated using numpy: cost_matrix = matrix_size * np.random.random((matrix_size, matrix_size))
  • Data of timing for solving LAP of random cost matrices of sizes 2min x 2min to 2max x 2max.
Solving matrices of sizes up to 2^{n} where n is {'lapsolver': 15, 'lap_lapjv': 15, 'munkres': 7, 'hungarian': 12, 'lapjv_lapjv': 15, 'scipy': 15}

8 x 8 ... 

Cycle  0   lapjv_lapjv  lap_lapjv  scipy  lapsolver  hungarian  munkres
Cycle  1   lapjv_lapjv  lap_lapjv  scipy  lapsolver  hungarian  munkres
Cycle  2   lapjv_lapjv  lap_lapjv  scipy  lapsolver  hungarian  munkres

.
.
.

16384 x 16384 ... 

Cycle  0   lapjv_lapjv  lap_lapjv  scipy  lapsolver  
Cycle  1   lapjv_lapjv  lap_lapjv  scipy  lapsolver  
Cycle  2   lapjv_lapjv  lap_lapjv  scipy  lapsolver 


Package Versions for the current run
Python -  3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 21:41:56) 
[GCC 7.3.0]
lap  -  0.4.0
scipy  -  1.4.0
lapsolver  -  1.0.2
munkres  -  1.0.12


Matrix_size  [  8        16       32       64       128      256      512      1024     2048     4096      8192     16384
lapjv_lapjv  [  0.00001  0.00001  0.00002  0.00004  0.00011  0.00066  0.00659  0.02742  0.13955  0.74462   3.05277  14.64501]
lap_lapjv    [  0.00005  0.00003  0.00004  0.00006  0.00018  0.00104  0.00795  0.03303  0.15438  1.92253   7.41732  50.84391]
scipy        [  0.00006  0.00005  0.00008  0.00018  0.00067  0.0025   0.01355  0.06346  0.34247  1.88682   9.30888  52.81564]
lapsolver    [  0.00002  0.00001  0.00003  0.0001   0.00038  0.00214  0.01347  0.07315  0.40603  2.47342   12.3546  77.90600]
hungarian    [  0.00001  0.00001  0.00004  0.00013  0.00071  0.00429  0.03393  0.24279  1.87886  15.30685  ]
munkres      [  0.00049  0.00384  0.04524  0.34832  3.26252  ]

Figure saved to file timing-LAPs-py3-8-32768.png


  • plot of timing for LAP solving random cost matrices of sizes 2min x 2min to 2max x 2max, where min and max are limited to smaller numbers for munkres and scipy in the interest of time.

alt text

If requested via the --printcost flag, it will also print the minimum cost for each random cost matrix by each implementation. This test ensures that the methods are making consistent/correct assignments.

Takeaways

  1. scipy==1.4 is much faster than previous versions and it is competitive with the other implementations, especially for larger matrices. This is a great development since it probably gets used more than the other implementations by virtue of scipy's popularity.
  2. munkres is much slower than hungarian, lapsolver, scipy, lap.lapjv, and lapjv.lapjv for all matrix sizes
  3. hungarian performs well for smaller matrices. For anything larger than 256x256, lapsolver, lap.lapjv and lapjv.lapjv are about an order of magnitude faster than hungarian
  4. lap.lapjv is am implementation intended to solve dense matrices. Its sparse matrix solver analog named lap.lapmod is more efficient for larger sparse matrices. Both are implemented in the lap module.
  5. lapjv.lapjv has the best performance virtually for all matrix sizes.
  6. For the purposes of improving Arbalign, hungarian remains a good choice for most molecular systems I'm interested in which don't have more than 100x100 distance matrices the same type to solve. However, if the tool is to be applied to larger molecules such as proteins and DNA, it would be worthwhile to use lapjv.lapjv, lapsolver, lap.lapjv or lap.lapmod
Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].