
Sciss / Strugatzki

Licence: other
Algorithms for matching audio file similarities. Mirror of https://git.iem.at/sciss/Strugatzki


Projects that are alternatives of or similar to Strugatzki

Meyda
Audio feature extraction for JavaScript.
Stars: ✭ 792 (+1984.21%)
Mutual labels:  feature-extraction, music-information-retrieval
Alignmentduration
Lyrics-to-audio alignment system based on machine-learning algorithms: hidden Markov models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
Stars: ✭ 36 (-5.26%)
Mutual labels:  music-information-retrieval, signal-processing
Aca Code
Matlab scripts accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
Stars: ✭ 67 (+76.32%)
Mutual labels:  music-information-retrieval, signal-processing
Audioowl
Fast and simple music and audio analysis using RNN in Python 🕵️‍♀️ 🥁
Stars: ✭ 151 (+297.37%)
Mutual labels:  feature-extraction, music-information-retrieval
antropy
AntroPy: entropy and complexity of (EEG) time-series in Python
Stars: ✭ 111 (+192.11%)
Mutual labels:  signal-processing, feature-extraction
spafe
🔉 spafe: Simplified Python Audio Features Extraction
Stars: ✭ 310 (+715.79%)
Mutual labels:  signal-processing, music-information-retrieval
bob
Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland. - Mirrored from https://gitlab.idiap.ch/bob/bob
Stars: ✭ 38 (+0%)
Mutual labels:  signal-processing, feature-extraction
Surfboard
Novoic's audio feature extraction library
Stars: ✭ 318 (+736.84%)
Mutual labels:  feature-extraction, signal-processing
Speech Feature Extraction
Feature extraction of speech signal is the initial stage of any speech recognition system.
Stars: ✭ 78 (+105.26%)
Mutual labels:  signal-processing, feature-extraction
Music-Genre-Classification
Automatic Music Genre Classification with Machine Learning Techniques
Stars: ✭ 49 (+28.95%)
Mutual labels:  signal-processing, music-information-retrieval
MixingBear
Package for automatic beat-mixing of music files in Python 🐻🎚
Stars: ✭ 73 (+92.11%)
Mutual labels:  feature-extraction, music-information-retrieval
Madmom
Python audio and music signal processing library
Stars: ✭ 728 (+1815.79%)
Mutual labels:  music-information-retrieval, signal-processing
Tfidf
Simple TF IDF Library
Stars: ✭ 6 (-84.21%)
Mutual labels:  feature-extraction
Cbir System
Content-Based Image Retrieval system (KTH DD2476 Project)
Stars: ✭ 9 (-76.32%)
Mutual labels:  feature-extraction
Gnss Sdr
GNSS-SDR, an open-source software-defined GNSS receiver
Stars: ✭ 801 (+2007.89%)
Mutual labels:  signal-processing
Pysiology
A Python package for physiological signal processing
Stars: ✭ 32 (-15.79%)
Mutual labels:  signal-processing
Musicinformationretrieval.com
Instructional notebooks on music information retrieval.
Stars: ✭ 845 (+2123.68%)
Mutual labels:  music-information-retrieval
Sincnet
SincNet is a neural architecture for efficiently processing raw audio samples.
Stars: ✭ 764 (+1910.53%)
Mutual labels:  signal-processing
Obspy
ObsPy: A Python Toolbox for seismology/seismological observatories.
Stars: ✭ 756 (+1889.47%)
Mutual labels:  signal-processing
Pykaldi
A Python wrapper for Kaldi
Stars: ✭ 756 (+1889.47%)
Mutual labels:  feature-extraction

Strugatzki


statement

Strugatzki is a Scala library containing several algorithms for audio feature extraction, aimed at similarity and dissimilarity measurements. They were originally used in my live electronic piece "Inter-Play/Re-Sound", then successively in the tape piece "Leere Null", the sound installation "Writing Machine", and the tape piece "Leere Null (2)".

(C)opyright 2011–2019 by Hanns Holger Rutz. All rights reserved. It is released under the GNU Lesser General Public License v2.1+ and comes with absolutely no warranties. To contact the author, send an email to contact at sciss.de.

requirements / installation

Builds with sbt against Scala 2.13, 2.12, 2.11. Depends on ScalaCollider and scopt.

Strugatzki can either be used as a standalone command-line tool, or embedded in your project as a library.

contributing

Please see the file CONTRIBUTING.md

running

standalone use

This assumes you check out Strugatzki from source, as the easiest way to use it in the terminal is via the sbt prompt. First, start sbt without arguments. In the sbt shell, execute run, which will print the switches for the different modules:

-f | --feature
      Feature extraction
-c | --correlate
      Find best correlation with database
-s | --segmentation
      Find segmentation breaks within a file
-x | --selfsimilarity
      Create an image of the self similarity matrix
--stats
      Statistics from feature database

To find out the switches for the extraction module, execute run -f. This will print the particular options available for this module. Note that while the API expects all times in sample frames with respect to the original sound file's sample rate, the standalone/terminal mode expects all times as floating-point seconds.

Another possibility is to build the standalone via sbt assembly and then execute it via the shell script ./strugatzki.

library use

If you build your project with sbt, the following line adds a dependency for Strugatzki:

"de.sciss" %% "strugatzki" % v

The current version v is "2.19.0".
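In a build.sbt file, this amounts to the following line (shown here with the current version substituted for v):

```scala
// in build.sbt
libraryDependencies += "de.sciss" %% "strugatzki" % "2.19.0"
```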

For documentation, please refer to the API docs for the moment. These can be created in the standard way (sbt doc). The main classes to look at are FeatureExtraction, FeatureCorrelation, and FeatureSegmentation. They are used in a similar fashion. E.g., to run feature extraction:

import de.sciss.processor._
import de.sciss.strugatzki._
import de.sciss.file._
import scala.concurrent.ExecutionContext.Implicits._

val fs           = FeatureExtraction.Config()
fs.audioInput    = file("my-audio-input")
fs.featureOutput = file("my-feature-aiff-output")
fs.metaOutput    = Some(file("my-meta-data-xml-output"))  // optional

// the process is constructed with the settings and a partial function which
// acts as a process observer
val f = FeatureExtraction.run(fs) {
  case Processor.Result(_, _) => println("Done.")
}
// f is a `Future` of the result you may want to work with

For the detailed settings, such as FFT size, number of MFCC coefficients, etc., please refer to the API docs.

algorithms

Strugatzki is not a full-fledged MIR system, but was rather born of my personal preference and experience, resulting in an API which is a bit idiosyncratic, but nevertheless completely independent of my specific use cases.

The feature vectors used are spectral envelope as defined by the Mel Frequency Cepstral Coefficients (MFCC) and the Loudness in Sones. The actual DSP algorithms responsible for their extraction are the MFCC and Loudness UGens included with SuperCollider, which were written by Dan Stowell and Nick Collins. They are used behind the scenes, running ScalaCollider in Non-Realtime-Mode.

In most processes, there is a parameter temporalWeight which specifies the weight assigned to MFCC versus loudness. A temporal weight of 0.0 means the temporal feature vector (loudness) is not taken into account, and a weight of 1.0 means that only the loudness is taken into account, while the spectral features (MFCC) are ignored.
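As an illustration only (this is not Strugatzki's actual implementation), the blending can be pictured as a linear interpolation between a spectral and a temporal per-frame distance:

```scala
// Illustrative sketch, not Strugatzki's internal code: combine a spectral
// (MFCC) distance and a temporal (loudness) distance using `temporalWeight`.
def combinedDistance(dSpectral: Double, dTemporal: Double, temporalWeight: Double): Double = {
  require(temporalWeight >= 0.0 && temporalWeight <= 1.0)
  // weight 0.0: only the spectral distance counts
  // weight 1.0: only the loudness distance counts
  (1.0 - temporalWeight) * dSpectral + temporalWeight * dTemporal
}
```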

The correlation, segmentation, and so forth are performed directly in Scala, using dedicated threads, providing an API for monitoring completion, failure, and progress, along with an abort hook. As of the current version, all processes run single-threaded, so there is plenty of headroom for future performance boosts through some form of parallelism. Strugatzki is an artistic and a research project, not a commercial application, so beware that it is not the fastest MIR system imaginable.

The feature vectors (MFCC and loudness) are calculated on a frame-by-frame basis using a sliding (FFT) window. They are written out as a regular AIFF sound file, which is a convenient format for storing evenly sampled multichannel floating-point streams. Each feature file is accompanied by a dedicated XML file which contains the extraction settings for future reference and use by the other algorithms.

There are two main algorithms that operate on the extracted features: The correlation module is capable of finding sounds in a database that match a target sound in terms of similarity or dissimilarity. The segmentation module is capable of suggesting break points in a single target sound on the basis of novelty, i.e. the maximisation of dissimilarity within a given time window.
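The novelty idea behind segmentation can be sketched as a toy score (hypothetical names, not Strugatzki's segmentation code) that averages the dissimilarity between the feature frames before and after each candidate position, within a window of a given number of frames:

```scala
// Illustrative sketch of a novelty score: for each position, average the
// pairwise dissimilarity between the `win` frames before it and the `win`
// frames after it. Peaks in this score suggest segmentation break points.
def novelty(frames: Vector[Array[Double]], win: Int)
           (dist: (Array[Double], Array[Double]) => Double): Vector[Double] =
  frames.indices.toVector.map { i =>
    val before = frames.slice(math.max(0, i - win), i)
    val after  = frames.slice(i, math.min(frames.length, i + win))
    if (before.isEmpty || after.isEmpty) 0.0
    else {
      val ds = for (a <- before; b <- after) yield dist(a, b)
      ds.sum / ds.length
    }
  }
```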

normalization

We have found it quite useful to normalize the MFCC by creating statistics over a large body of database sounds. Therefore, a separate stats module is provided which can scan a directory of feature extraction files and calculate the minimum and maximum ranges for each coefficient. In the standalone mode, these ranges can be written out to a dedicated AIFF file, and may be used for correlation and segmentation, yielding, in our opinion, better results.
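A minimal sketch of the underlying idea, with hypothetical helper names (Strugatzki's stats module computes such ranges itself and writes them to an AIFF file rather than keeping them in memory):

```scala
// Illustrative sketch: per-coefficient (min, max) statistics over a body of
// feature frames, and normalization of a frame to the 0..1 range.
def ranges(frames: Seq[Array[Double]]): Array[(Double, Double)] = {
  val numCoeffs = frames.head.length
  Array.tabulate(numCoeffs) { i =>
    val col = frames.map(_(i))
    (col.min, col.max)
  }
}

def normalize(frame: Array[Double], r: Array[(Double, Double)]): Array[Double] =
  frame.zip(r).map { case (x, (lo, hi)) =>
    if (hi == lo) 0.0 else (x - lo) / (hi - lo)
  }
```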

self similarity

For analysis and visualisation purposes, we have added a self-similarity module which produces a PNG image file containing the self-similarity matrix of a given feature file.
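Conceptually, such a matrix holds a similarity value for every pair of feature frames. A minimal sketch using cosine similarity (hypothetical helper names; Strugatzki's actual module may use a different measure and renders the result as an image):

```scala
// Illustrative sketch: cosine similarity between two feature vectors.
def cosine(a: Array[Double], b: Array[Double]): Double = {
  val dot  = a.zip(b).map { case (x, y) => x * y }.sum
  val magA = math.sqrt(a.map(x => x * x).sum)
  val magB = math.sqrt(b.map(x => x * x).sum)
  if (magA == 0.0 || magB == 0.0) 0.0 else dot / (magA * magB)
}

// The self-similarity matrix: entry (i, j) is the similarity of frames i and j.
def selfSimilarity(frames: Vector[Array[Double]]): Vector[Vector[Double]] =
  frames.map(a => frames.map(b => cosine(a, b)))
```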
