All Projects → ensozos → Matrix-Profile

ensozos / Matrix-Profile

Licence: MIT license
A Java library for Matrix Profile

Programming Languages

java
68154 projects - #9 most used programming language

Projects that are alternatives of or similar to Matrix-Profile

khiva-ruby
High-performance time series algorithms for Ruby
Stars: ✭ 27 (+92.86%)
Mutual labels:  time-series, matrix-profile
tsmp
R Functions implementing UCR Matrix Profile Algorithm
Stars: ✭ 63 (+350%)
Mutual labels:  time-series, matrix-profile
timemachines
Predict time-series with one line of code.
Stars: ✭ 342 (+2342.86%)
Mutual labels:  time-series
sknifedatar
sknifedatar is a package that serves primarily as an extension to the modeltime 📦 ecosystem. In addition to some functionalities of spatial data and visualization.
Stars: ✭ 30 (+114.29%)
Mutual labels:  time-series
Prediction-using-Bayesian-Neural-Network
Prediction of continuous signals data and Web tracking data using dynamic Bayesian neural network. Compared with other network architectures aswell.
Stars: ✭ 28 (+100%)
Mutual labels:  time-series
data-viz-utils
Functions for easily making publication-quality figures with matplotlib.
Stars: ✭ 16 (+14.29%)
Mutual labels:  time-series
greycat
GreyCat - Data Analytics, Temporal data, What-if, Live machine learning
Stars: ✭ 104 (+642.86%)
Mutual labels:  time-series
awesome-Python-data-science-books
Probably the best curated list of data science books in Python
Stars: ✭ 331 (+2264.29%)
Mutual labels:  time-series
renewcast
Renewcast: Forecasting Renewable Electricity Generation in EU Countries.
Stars: ✭ 28 (+100%)
Mutual labels:  time-series
pytorch-timeseries
PyTorch implementations of neural networks for timeseries classification
Stars: ✭ 76 (+442.86%)
Mutual labels:  time-series
ticktock
An OpenTSDB-like time series database, with much better performance.
Stars: ✭ 34 (+142.86%)
Mutual labels:  time-series
unicornn
Official code for UnICORNN (ICML 2021)
Stars: ✭ 21 (+50%)
Mutual labels:  time-series
MarketData.jl
Time series market data
Stars: ✭ 118 (+742.86%)
Mutual labels:  time-series
msda
Library for multi-dimensional, multi-sensor, uni/multivariate time series data analysis, unsupervised feature selection, unsupervised deep anomaly detection, and prototype of explainable AI for anomaly detector
Stars: ✭ 80 (+471.43%)
Mutual labels:  time-series
wavefront-proxy
Wavefront Proxy Project
Stars: ✭ 45 (+221.43%)
Mutual labels:  time-series
Deep XF
Package towards building Explainable Forecasting and Nowcasting Models with State-of-the-art Deep Neural Networks and Dynamic Factor Model on Time Series data sets with single line of code. Also, provides utilify facility for time-series signal similarities matching, and removing noise from timeseries signals.
Stars: ✭ 83 (+492.86%)
Mutual labels:  time-series
Awesome-Human-Activity-Recognition
An up-to-date & curated list of Awesome IMU-based Human Activity Recognition(Ubiquitous Computing) papers, methods & resources. Please note that most of the collections of researches are mainly based on IMU data.
Stars: ✭ 72 (+414.29%)
Mutual labels:  time-series
CoronaDash
COVID-19 spread shiny dashboard with a forecasting model, countries' trajectories graphs, and cluster analysis tools
Stars: ✭ 20 (+42.86%)
Mutual labels:  time-series
timegan-pytorch
This repository is a non-official implementation of TimeGAN (Yoon et al., NIPS2019) using PyTorch.
Stars: ✭ 46 (+228.57%)
Mutual labels:  time-series
notebooks
Code examples for pyFTS
Stars: ✭ 40 (+185.71%)
Mutual labels:  time-series

Matrix Profile Library

A Java library for Matrix Profile

Maven Central

Overview

matrix profile exmple

The Matrix Profile, has the potential to revolutionize time series data mining because of its generality, versatility, simplicity and scalability. In particular it has implications for time series motif discovery, time series joins, shapelet discovery (classification), density estimation, semantic segmentation, visualization, rule discovery, clustering etc

The advantages of using the Matrix Profile (over hashing, indexing, brute forcing a dimensionality reduced representation etc.) for most time series data mining tasks include:

  • It is exact: For motif discovery, discord discovery, time series joins etc., the Matrix Profile based methods provide no false positives or false dismissals.
  • It is simple and parameter-free: In contrast, the more general spatial access method algorithms typically require building and tuning spatial access methods and/or hash function.
  • It is space efficient: Matrix Profile construction algorithms requires an inconsequential space overhead, just linear in the time series length with a small constant factor, allowing massive datasets to be processed in main memory.
  • It allows anytime algorithms: While our exact algorithms are extremely scalable, for extremely large datasets we can compute the Matrix Profile in an anytime fashion, allowing ultra-fast approximate solutions.
  • It is incrementally maintainable: Having computed the Matrix Profile for a dataset, we can incrementally update it very efficiently. In many domains this means we can effectively maintain exact joins/motifs/discords on streaming data forever.
  • It does not require the user to set similarity/distance thresholds: For time series joins, the Matrix Profile provides full joins, eliminating the need to specify a similarity threshold, which is an unintuitive task for time series.
  • It can leverage hardware: Matrix Profile construction is embarrassingly parallelizable, both on multicore processors and in distributed systems.
  • It has time complexity that is constant in subsequence length: This is a very unusual and desirable property; all known time series join/motif/discord algorithms scale poorly as the subsequence length grows. In contrast, we have shown time series joins/motifs with subsequences lengths up to 100,000, at least two orders of magnitude longer than any other work we are aware of.
  • It can be constructed in deterministic time: All join/motif/discord algorithms we are aware of can radically different times to finish on two (even slightly) different datasets. In contrast, given only the length of the time series, we can precisely predict in advance how long it will take to compute the Matrix Profile
  • It can handle missing data: Even in the presence of missing data, we can provide answers which are guaranteed to have no false negatives.

For more information about matrix profile check out The UCR Matrix Profile Page

In version 0.0.2 we implemented MPdist measure. The useful properties of the MPdist include:

  • Ability to compare time series of different lengths.
  • Robust to spikes, dropouts, wandering baseline, missing values, and other issues that are common outside of benchmark dataset.
  • Invariance to amplitude and offset offered by DTW and Euclidean distance as well as additional invariances, including phase invariance, order invariance, liner trend invariance, and stutter invariance.
  • Allows scalability

We followed the fast MPdist algorithm that can be found in section "Speeding up MPdist Search" in the official paper.

Usage

  • MPdist (query by content)
  • STMP (two time series)
  • STMP (self join)
  • STAMP (with two time series)
  • STAMP (self join)
  • Misc:
    • MASS v2.0
    • Fast moving SD
    • Fast moving average

Note: We are using ND4j as time series representation. You can find more information about ND4j here

Installation

You can pull Matrix Profile library from the central maven repository, just add to pom.xml file:

<dependency>
   <groupId>io.github.ensozos</groupId>
   <artifactId>matrix-profile</artifactId>
   <version>0.0.3</version>
</dependency>

For gradle users add this to build.gradle:

compile 'io.github.ensozos:matrix-profile:0.0.3'

Example

The user needs to create a MatrixProfile profile object and pass the time series (INDArrays) as parameters:

  MatrixProfile matrixProfile = new MatrixProfile();
  
  int window = 4;
  INDArray target = Nd4j.create(new double[]{0.0, 6.0, -1.0, 2.0, 3.0, 1.0, 4.0}, new int[]{1, 7});
  INDArray query = Nd4j.create(new double[]{1.0, 2.0, 0.0, 0.0, -1}, new int[]{1, 5});

  matrixProfile.stamp(target, query, window);

For Matrix Profile distance you need to create MPdistance object:

  MPdistance mpDist = new MPdistance();
  
  int window = ...;
  INDArray target = Nd4j.create(...);
  INDArray query = Nd4j.create(...);

  mpDist.getMPdistance(target, query, window);

Other projects with Matrix Profile

License

Distributed under the MIT license. See LICENSE for more information.

Paper Citation

  • Yeh, Chin-Chia Michael & Zhu, Yan & Ulanova, Liudmila & Begum, Nurjahan & Ding, Yifei & Dau, Anh & Silva, Diego & Mueen, Abdullah & Keogh, Eamonn. (2016). Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View That Includes Motifs, Discords and Shapelets. 1317-1322. 10.1109/ICDM.2016.0179.

  • Abdullah Mueen, Yan Zhu, Michael Yeh, Kaveh Kamgar, Krishnamurthy Viswanathan, Chetan Kumar Gupta and Eamonn Keogh (2015), The Fastest Similarity Search Algorithm for Time Series Subsequences under Euclidean Distance

  • Matrix Profile XII: MPdist: A Novel Time Series Distance Measure to Allow Data Mining in More Challenging Scenarios. Shaghayegh Gharghabi, Shima Imani, Anthony Bagnall, Amirali Darvishzadeh, Eamonn Keogh. ICDM 2018.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].