Diffusion Maps and Geometric Harmonics for Python
Overview
The diffusion-maps library for Python provides a fast and accurate implementation of diffusion maps[fn:1] and geometric harmonics[fn:2]. Its speed stems from the use of sparse linear algebra and (optionally) graphics processing units to accelerate computations.
When running on GPUs, the included code routinely solves eigenvalue problems three times faster than SciPy on matrices with over 200 million non-zero entries.
The package includes a command-line utility for the quick calculation of diffusion maps on data sets.
Some of the features of the diffusion-maps module include:
- Fast evaluation of distance matrices using nearest neighbors.
- Fast and accurate computation of eigenvalue/eigenvector pairs using sparse linear algebra.
- Optional GPU-accelerated sparse linear algebra routines.
- Optional interface to the ARPACK-NG library.
- Simple and easily modifiable code.
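The ideas behind the features listed above can be illustrated with a minimal, self-contained sketch written directly against NumPy and SciPy. It does not use the diffusion-maps API. The function name diffusion_map_sketch, its parameters, the Gaussian kernel exp(-d^2 / (2 epsilon)), and the random-walk normalization are all assumptions chosen for illustration, and the library's own implementation may differ.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh
from scipy.spatial import cKDTree


def diffusion_map_sketch(points, epsilon, num_pairs=4, num_neighbors=10):
    """Illustrative only: leading eigenpairs of a sparse diffusion operator."""
    n = points.shape[0]

    # Nearest-neighbor search yields a sparse set of distances instead of
    # the full n-by-n distance matrix. Each point is its own first neighbor.
    tree = cKDTree(points)
    distances, indices = tree.query(points, k=num_neighbors)

    # Gaussian kernel evaluated on the retained distances (assumed convention).
    rows = np.repeat(np.arange(n), num_neighbors)
    weights = np.exp(-distances.ravel() ** 2 / (2.0 * epsilon))
    kernel = csr_matrix((weights, (rows, indices.ravel())), shape=(n, n))

    # Nearest-neighbor relations are not symmetric, so symmetrize the kernel.
    kernel = kernel.maximum(kernel.T)

    # The row-stochastic matrix D^{-1} K is similar to the symmetric matrix
    # D^{-1/2} K D^{-1/2}, which a sparse symmetric eigensolver can handle.
    degrees = np.asarray(kernel.sum(axis=1)).ravel()
    inv_sqrt = diags(1.0 / np.sqrt(degrees))
    symmetric = inv_sqrt @ kernel @ inv_sqrt

    eigenvalues, eigenvectors = eigsh(symmetric, k=num_pairs, which='LA')

    # Sort in decreasing order and map back to eigenvectors of D^{-1} K.
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], inv_sqrt @ eigenvectors[:, order]


# Example data: 200 equally spaced points on the unit circle.
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack((np.cos(angles), np.sin(angles)))
eigenvalues, eigenvectors = diffusion_map_sketch(circle, epsilon=0.01)
print(eigenvalues)
```

For a connected neighborhood graph, the leading eigenvalue of the row-stochastic matrix is 1 and its eigenvector is constant; the subsequent eigenvectors provide the diffusion-map coordinates.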
[fn:1] Coifman, R. R., & Lafon, S. (2006). Diffusion maps. Applied and Computational Harmonic Analysis, 21(1), 5–30. http://doi.org/10.1016/j.acha.2006.04.006
[fn:2] Coifman, R. R., & Lafon, S. (2006). Geometric harmonics: A novel tool for multiscale out-of-sample extension of empirical functions. Applied and Computational Harmonic Analysis, 21(1), 31–52. http://doi.org/10.1016/j.acha.2005.07.005
Prerequisites
The library requires Python 3.5 or later, NumPy, and SciPy. Installing PyCUDA is recommended, since it enables the GPU-accelerated eigenvalue solver.
The diffusion-maps command can display the resulting diffusion maps using Matplotlib if it is available.
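One way to check which of the optional packages are present before installing is sketched below. It relies only on the standard library; the helper name optional_dependency_status is hypothetical and is not part of diffusion-maps.

```python
import importlib.util


def optional_dependency_status(modules=("pycuda", "matplotlib")):
    """Report whether each named top-level module can be imported."""
    # find_spec returns None when a top-level module cannot be located,
    # and it does not import the module itself.
    return {name: importlib.util.find_spec(name) is not None
            for name in modules}


status = optional_dependency_status()
for name, available in sorted(status.items()):
    print("{}: {}".format(name, "available" if available else "missing"))
```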
Installation
To install on your system, run:
python setup.py install
For a user-specific installation, run:
python setup.py install --user
Command-line utility
The diffusion-maps command reads data sets stored in NPY, CSV, or MATLAB’s MAT format. The simplest way to use it is to invoke it as follows:
diffusion-maps DATA-SET.NPY EPSILON-VALUE
There exist parameters to save and visualize different types of results, to specify how many eigenvalue/eigenvector pairs to compute, etc. See the help page displayed by:
diffusion-maps --help
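As a sketch of how an NPY data set might be prepared for the command, the following writes a small array with NumPy. The file name circle.npy and the layout of one data point per row are assumptions made for illustration; check the help page for the layout the command actually expects.

```python
import os
import tempfile

import numpy as np

# Sample 500 points on the unit circle with a fixed seed for reproducibility.
rng = np.random.RandomState(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=500)
points = np.column_stack((np.cos(angles), np.sin(angles)))

# Store the array in NumPy's NPY format (hypothetical file name).
path = os.path.join(tempfile.gettempdir(), "circle.npy")
np.save(path, points)

# Read the file back to confirm that it round-trips.
loaded = np.load(path)
print(path, loaded.shape)
```

The resulting file can then be passed to the command as DATA-SET.NPY, together with an EPSILON-VALUE suited to the scale of the data.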
Additional documentation
Sphinx-based API documentation is available in the doc/ folder. To build it, run:
make -C doc html
License
This code is released under the MIT license. See LICENSE for details.
Citation
If you use this code in publications, please cite it as:
- Juan M. Bello-Rivas. (2017, May 20). jmbr/diffusion-maps 0.0.1 (Version 0.0.1). Zenodo. http://doi.org/10.5281/zenodo.581667
Acknowledgments
The diffusion-maps library was originally written by Juan M. Bello-Rivas.
Others have contributed to diffusion-maps by reporting problems, suggesting improvements, or submitting code. Here is a list of these people. Help me keep it complete and free of errors.
- Felix Dietrich
- Mahdi Kooshkbaghi
- Daniel Lehmberg
- Philipp Schuegraf