
Sciss / Strugatzki

Licence: other
Algorithms for matching audio file similarities. Mirror of https://git.iem.at/sciss/Strugatzki


Projects that are alternatives of or similar to Strugatzki

Meyda
Audio feature extraction for JavaScript.
Stars: ✭ 792 (+1984.21%)
Mutual labels:  feature-extraction, music-information-retrieval
Alignmentduration
Lyrics-to-audio alignment system based on machine-learning algorithms: hidden Markov models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
Stars: ✭ 36 (-5.26%)
Mutual labels:  music-information-retrieval, signal-processing
Aca Code
Matlab scripts accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
Stars: ✭ 67 (+76.32%)
Mutual labels:  music-information-retrieval, signal-processing
Audioowl
Fast and simple music and audio analysis using RNN in Python 🕵️‍♀️ 🥁
Stars: ✭ 151 (+297.37%)
Mutual labels:  feature-extraction, music-information-retrieval
antropy
AntroPy: entropy and complexity of (EEG) time-series in Python
Stars: ✭ 111 (+192.11%)
Mutual labels:  signal-processing, feature-extraction
spafe
🔉 spafe: Simplified Python Audio Features Extraction
Stars: ✭ 310 (+715.79%)
Mutual labels:  signal-processing, music-information-retrieval
bob
Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland. - Mirrored from https://gitlab.idiap.ch/bob/bob
Stars: ✭ 38 (+0%)
Mutual labels:  signal-processing, feature-extraction
Surfboard
Novoic's audio feature extraction library
Stars: ✭ 318 (+736.84%)
Mutual labels:  feature-extraction, signal-processing
Speech Feature Extraction
Feature extraction of speech signal is the initial stage of any speech recognition system.
Stars: ✭ 78 (+105.26%)
Mutual labels:  signal-processing, feature-extraction
Music-Genre-Classification
Automatic Music Genre Classification with Machine Learning Techniques
Stars: ✭ 49 (+28.95%)
Mutual labels:  signal-processing, music-information-retrieval
MixingBear
Package for automatic beat-mixing of music files in Python 🐻🎚
Stars: ✭ 73 (+92.11%)
Mutual labels:  feature-extraction, music-information-retrieval
Madmom
Python audio and music signal processing library
Stars: ✭ 728 (+1815.79%)
Mutual labels:  music-information-retrieval, signal-processing
Tfidf
Simple TF IDF Library
Stars: ✭ 6 (-84.21%)
Mutual labels:  feature-extraction
Cbir System
Content-Based Image Retrieval system (KTH DD2476 Project)
Stars: ✭ 9 (-76.32%)
Mutual labels:  feature-extraction
Gnss Sdr
GNSS-SDR, an open-source software-defined GNSS receiver
Stars: ✭ 801 (+2007.89%)
Mutual labels:  signal-processing
Pysiology
A Python package for physiological signal processing
Stars: ✭ 32 (-15.79%)
Mutual labels:  signal-processing
Musicinformationretrieval.com
Instructional notebooks on music information retrieval.
Stars: ✭ 845 (+2123.68%)
Mutual labels:  music-information-retrieval
Sincnet
SincNet is a neural architecture for efficiently processing raw audio samples.
Stars: ✭ 764 (+1910.53%)
Mutual labels:  signal-processing
Obspy
ObsPy: A Python Toolbox for seismology/seismological observatories.
Stars: ✭ 756 (+1889.47%)
Mutual labels:  signal-processing
Pykaldi
A Python wrapper for Kaldi
Stars: ✭ 756 (+1889.47%)
Mutual labels:  feature-extraction

Strugatzki


statement

Strugatzki is a Scala library containing several algorithms for audio feature extraction, aimed at similarity and dissimilarity measurements. They were originally used in my live electronic piece "Inter-Play/Re-Sound", then successively in the tape piece "Leere Null", the sound installation "Writing Machine", and the tape piece "Leere Null (2)".

(C)opyright 2011–2019 by Hanns Holger Rutz. All rights reserved. It is released under the GNU Lesser General Public License v2.1+ and comes with absolutely no warranties. To contact the author, send an email to contact at sciss.de.

requirements / installation

Builds with sbt against Scala 2.13, 2.12, 2.11. Depends on ScalaCollider and scopt.

Strugatzki can either be used as a standalone command-line tool, or embedded in your project as a library.

contributing

Please see the file CONTRIBUTING.md

running

standalone use

This assumes you check out Strugatzki from source, as the easiest way to use it in the terminal is via the sbt prompt. First, start sbt without arguments. In the sbt shell, execute run, which will print the switches for the different modules:

-f | --feature
      Feature extraction
-c | --correlate
      Find best correlation with database
-s | --segmentation
      Find segmentation breaks within a file
-x | --selfsimilarity
      Create an image of the self similarity matrix
--stats
      Statistics from feature database

To find out the switches for the extraction module, execute run -f. This will print the particular options available for this module. Note that while the API expects all times in sample frames with respect to the original sound file's sample rate, the standalone/terminal mode expects all times as floating-point seconds.

Another possibility is to build the standalone via sbt assembly and then execute it via the shell script ./strugatzki.

library use

If you build your project with sbt, the following line adds a dependency for Strugatzki:

"de.sciss" %% "strugatzki" % v

The current version v is "2.19.0".
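In a build.sbt file, this amounts to the following line (shown here with the current version substituted for v):

```scala
// in build.sbt
libraryDependencies += "de.sciss" %% "strugatzki" % "2.19.0"
```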

For documentation, please refer to the API docs for the moment. These can be created in the standard way (sbt doc). The main classes to look at are FeatureExtraction, FeatureCorrelation, and FeatureSegmentation. They are used in a similar fashion. E.g., to run feature extraction:

import de.sciss.processor._
import de.sciss.strugatzki._
import de.sciss.file._
import scala.concurrent.ExecutionContext.Implicits._

val fs           = FeatureExtraction.Config()
fs.audioInput    = file("my-audio-input")
fs.featureOutput = file("my-feature-aiff-output")
fs.metaOutput    = Some(file("my-meta-data-xml-output"))  // optional

// the process is constructed with the settings and a partial function which
// acts as a process observer
val f = FeatureExtraction.run(fs) {
  case Processor.Result(_, _) => println("Done.")
}
// f is a `Future` of the result you may want to work with

For the detailed settings, such as FFT size, number of MFCC coefficients, etc., please refer to the API docs.

algorithms

Strugatzki is not a full-fledged MIR system, but was rather born of my personal preference and experience, resulting in an API which is a bit idiosyncratic, but nevertheless completely independent of my specific use cases.

The feature vectors used are spectral envelope as defined by the Mel Frequency Cepstral Coefficients (MFCC) and the Loudness in Sones. The actual DSP algorithms responsible for their extraction are the MFCC and Loudness UGens included with SuperCollider, which were written by Dan Stowell and Nick Collins. They are used behind the scenes, running ScalaCollider in Non-Realtime-Mode.

In most processes, there is a parameter temporalWeight which specifies the weight assigned to MFCC versus loudness. A temporal weight of 0.0 means the temporal feature vector (loudness) is not taken into account, and a weight of 1.0 means that only the loudness is taken into account, while the spectral features (MFCC) are ignored.
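As an illustration only (this is not Strugatzki's actual implementation), the blending can be pictured as a linear interpolation between a spectral and a temporal per-frame distance:

```scala
// Illustrative sketch, not Strugatzki's internal code: combine a spectral
// (MFCC) distance and a temporal (loudness) distance using `temporalWeight`.
def combinedDistance(dSpectral: Double, dTemporal: Double, temporalWeight: Double): Double = {
  require(temporalWeight >= 0.0 && temporalWeight <= 1.0)
  // weight 0.0: only the spectral distance counts
  // weight 1.0: only the loudness distance counts
  (1.0 - temporalWeight) * dSpectral + temporalWeight * dTemporal
}
```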

The correlation, segmentation, and so forth are performed directly in Scala, using dedicated threads, providing an API for monitoring completion, failure, and progress, along with an abort hook. As of the current version, all processes run single-threaded, so there is plenty of headroom for future performance boosts through some form of parallelism. Strugatzki is an artistic and a research project, not a commercial application, so beware that it is not the fastest MIR system imaginable.

The feature vectors (MFCC and loudness) are calculated on a frame-by-frame basis using a sliding (FFT) window. They are written out as a regular AIFF sound file, which is a convenient format for storing evenly sampled multichannel floating-point streams. Each feature file is accompanied by a dedicated XML file which contains the extraction settings for future reference and use by the other algorithms.

There are two main algorithms that operate on the extracted features: The correlation module is capable of finding sounds in a database that match a target sound in terms of similarity or dissimilarity. The segmentation module is capable of suggesting break points in a single target sound on the basis of novelty, i.e. the maximisation of dissimilarity within a given time window.
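The novelty idea behind segmentation can be sketched as a toy score (hypothetical names, not Strugatzki's segmentation code) that averages the dissimilarity between the feature frames before and after each candidate position, within a window of a given number of frames:

```scala
// Illustrative sketch of a novelty score: for each position, average the
// pairwise dissimilarity between the `win` frames before it and the `win`
// frames after it. Peaks in this score suggest segmentation break points.
def novelty(frames: Vector[Array[Double]], win: Int)
           (dist: (Array[Double], Array[Double]) => Double): Vector[Double] =
  frames.indices.toVector.map { i =>
    val before = frames.slice(math.max(0, i - win), i)
    val after  = frames.slice(i, math.min(frames.length, i + win))
    if (before.isEmpty || after.isEmpty) 0.0
    else {
      val ds = for (a <- before; b <- after) yield dist(a, b)
      ds.sum / ds.length
    }
  }
```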

normalization

We have found it quite useful to normalize the MFCC by creating statistics over a large body of database sounds. Therefore, a separate stats module is provided which can scan a directory of feature extraction files and calculate the minimum and maximum ranges for each coefficient. In the standalone mode, these ranges can be written out to a dedicated AIFF file, and may be used for correlation and segmentation, yielding, in our opinion, better results.
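A minimal sketch of the underlying idea, with hypothetical helper names (Strugatzki's stats module computes such ranges itself and writes them to an AIFF file rather than keeping them in memory):

```scala
// Illustrative sketch: per-coefficient (min, max) statistics over a body of
// feature frames, and normalization of a frame to the 0..1 range.
def ranges(frames: Seq[Array[Double]]): Array[(Double, Double)] = {
  val numCoeffs = frames.head.length
  Array.tabulate(numCoeffs) { i =>
    val col = frames.map(_(i))
    (col.min, col.max)
  }
}

def normalize(frame: Array[Double], r: Array[(Double, Double)]): Array[Double] =
  frame.zip(r).map { case (x, (lo, hi)) =>
    if (hi == lo) 0.0 else (x - lo) / (hi - lo)
  }
```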

self similarity

For analysis and visualisation purposes, we have added a self-similarity module which produces a PNG image file containing the self-similarity matrix of a given feature file.
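Conceptually, such a matrix holds a similarity value for every pair of feature frames. A minimal sketch using cosine similarity (hypothetical helper names; Strugatzki's actual module may use a different measure and renders the result as an image):

```scala
// Illustrative sketch: cosine similarity between two feature vectors.
def cosine(a: Array[Double], b: Array[Double]): Double = {
  val dot  = a.zip(b).map { case (x, y) => x * y }.sum
  val magA = math.sqrt(a.map(x => x * x).sum)
  val magB = math.sqrt(b.map(x => x * x).sum)
  if (magA == 0.0 || magB == 0.0) 0.0 else dot / (magA * magB)
}

// The self-similarity matrix: entry (i, j) is the similarity of frames i and j.
def selfSimilarity(frames: Vector[Array[Double]]): Vector[Vector[Double]] =
  frames.map(a => frames.map(b => cosine(a, b)))
```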
