AddictedCS / Soundfingerprinting

Licence: other

Open source audio fingerprinting in .NET. An efficient algorithm for acoustic fingerprinting written purely in C#.

Labels

audio algorithm audio-processing recognition nearest-neighbor-search

Audio fingerprinting and recognition in .NET

soundfingerprinting is a C# framework designed for companies, enthusiasts, researchers in the fields of audio and digital signal processing, data mining and audio recognition. It implements an efficient algorithm which provides fast insert and retrieval of acoustic fingerprints with high precision and recall rate.

Documentation

The code snippet below shows how to extract acoustic fingerprints from an audio file and later use them as identifiers to recognize an unknown audio query. These sub-fingerprints (or fingerprints; the two terms are used interchangeably) are stored in a configurable datastore.

private readonly IModelService modelService = new InMemoryModelService(); // store fingerprints in RAM
private readonly IAudioService audioService = new SoundFingerprintingAudioService(); // default audio library

public async Task StoreForLaterRetrieval(string pathToAudioFile)
{
    var track = new TrackInfo("GBBKS1200164", "Skyfall", "Adele");

    // create fingerprints
    var hashedFingerprints = await FingerprintCommandBuilder.Instance
                                .BuildFingerprintCommand()
                                .From(pathToAudioFile)
                                .UsingServices(audioService)
                                .Hash();
								
    // store hashes in the database for later retrieval
    modelService.Insert(track, hashedFingerprints);
}

Querying

Once you have inserted the fingerprints into the datastore, you can query the storage to recognize the song that your audio samples come from. The origin of the query samples may vary: file, URL, microphone, radio tuner, etc. Where the samples come from is up to your application.

public async Task<TrackData> GetBestMatchForSong(string queryAudioFile)
{
    int secondsToAnalyze = 10; // number of seconds to analyze from query file
    int startAtSecond = 0; // start at the beginning
	
    // query the underlying database for similar audio sub-fingerprints
    var queryResult = await QueryCommandBuilder.Instance.BuildQueryCommand()
                                         .From(queryAudioFile, secondsToAnalyze, startAtSecond)
                                         .UsingServices(modelService, audioService)
                                         .Query();
    
    return queryResult.BestMatch.Track;
}

Fingerprints Storage

The default storage, which comes bundled with the soundfingerprinting NuGet package, is a plain in-memory storage, available via the InMemoryModelService class. If you plan to use an external persistent storage for audio fingerprints, Emy is the preferred choice. It is a specialized storage developed for audio fingerprints. Emy provides a community version which is free for non-commercial use. You can try it with Docker:

docker run -d -v /persistent-dir:/app/data -p 3399:3399 -p 3340:3340 addictedcs/soundfingerprinting.emy:latest

Emy provides a backoffice interface which you can access on port 3340. To insert into and query the Emy server, install the SoundFingerprinting.Emy NuGet package.

Install-Package SoundFingerprinting.Emy

The package provides the EmyModelService class, which can substitute for the default InMemoryModelService.

// connect to Emy on port 3399
var emyModelService = EmyModelService.NewInstance("localhost", 3399);

// query Emy database (pass the Emy model service, not the in-memory one)
var queryResult = await QueryCommandBuilder.Instance.BuildQueryCommand()
                                        .From(queryAudioFile, secondsToAnalyze, startAtSecond)
                                        .UsingServices(emyModelService, audioService)
                                        .Query();

// register matches so that they appear in the dashboard
emyModelService.RegisterMatches(queryResult.ResultEntries);

Registering matches is now possible with EmyModelService. The results will be displayed in the Emy dashboard.

Similarly, SoundFingerprinting.Emy provides FFmpegAudioService, which supports a wide variety of formats for both audio and video fingerprinting. More details about FFmpegAudioService can be found below.

If you plan to use Emy storage in a commercial project, please contact [email protected] for details. The enterprise version is ~12.5x faster when the number of tracks exceeds ~10K, and it supports clustering, replication and much more. By using Emy you will also support the core SoundFingerprinting library and its ongoing development.

Previous storages are now considered deprecated, as Emy is the default choice for persistent storage.

Solr non-relational storage soundfingerprinting.solr [deprecated]. MIT licensed, useful when the number of tracks does not exceed 5000.
MSSQL soundfingerprinting.sql [deprecated]. MIT licensed.

Supported audio formats

Read the Supported Audio Formats page for details about the different audio services and how you can use them on various operating systems.

Query result details

Every ResultEntry object will contain the following information:

Track - matched track from the datastore
QueryMatchLength - returns how many seconds of the query matched the resulting track
QueryMatchStartsAt - returns the time position where the resulting track started to match in the query
TrackMatchStartsAt - returns the time position where the query started to match in the resulting track
TrackStartsAt - returns an approximation of where the matched track starts, always relative to the query
Confidence - returns a value in the range [0, 1]. A value below 0.15 is most probably a false positive. A value bigger than 0.15 is very likely to be an exact match. For queries with good audio quality you can expect a confidence > 0.5.
MatchedAt - returns a timestamp showing when the match occurred. Useful for realtime queries.
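The Confidence thresholds above can be applied as a simple post-filter. The following is a minimal, self-contained sketch rather than part of the library: MatchCandidate and ConfidenceFilter are hypothetical stand-ins for illustration, and in a real application you would read Confidence from the ResultEntry objects returned by the query instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// 0.15 is the false-positive cut-off quoted in the list above; tune it for your data.
const double MinConfidence = 0.15;

var candidates = new List<MatchCandidate>
{
    new("Skyfall", 0.62),
    new("Unrelated track", 0.09),
    new("Noisy recording", 0.21),
};

foreach (var match in ConfidenceFilter.Accept(candidates, MinConfidence))
{
    Console.WriteLine($"{match.Title}: {match.Confidence}");
}

// Hypothetical stand-in for the Track and Confidence fields of a result entry.
public record MatchCandidate(string Title, double Confidence);

public static class ConfidenceFilter
{
    // Keeps candidates strictly above the threshold, best match first.
    public static IReadOnlyList<MatchCandidate> Accept(
        IEnumerable<MatchCandidate> candidates, double minConfidence) =>
        candidates.Where(c => c.Confidence > minConfidence)
                  .OrderByDescending(c => c.Confidence)
                  .ToList();
}
```

Sorting by confidence keeps the most trustworthy match first, which is convenient when a query returns several entries.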

Stats contains useful statistics information for fine-tuning the algorithm:

QueryDuration - time in milliseconds spent just querying the fingerprints datasource.
FingerprintingDuration - time in milliseconds spent generating the acoustic fingerprints from the media file.
TotalTracksAnalyzed - total number of tracks analyzed during query time. If this number exceeds 50, try optimizing your configuration.
TotalFingerprintsAnalyzed - total number of fingerprints analyzed during query time. If this number exceeds 500, try optimizing your configuration.
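The two limits above lend themselves to an automated check, for example in a monitoring job. This is a hedged sketch only: QueryStatsCheck and its Review method are hypothetical helpers, and the two integers are assumed to be copied from the TotalTracksAnalyzed and TotalFingerprintsAnalyzed statistics of a query result.

```csharp
using System;
using System.Collections.Generic;

// Example values; in practice copy them from the query statistics.
var warnings = QueryStatsCheck.Review(totalTracksAnalyzed: 72, totalFingerprintsAnalyzed: 310);
foreach (var warning in warnings)
{
    Console.WriteLine(warning);
}

public static class QueryStatsCheck
{
    // Limits quoted in the Stats description above.
    public const int MaxTracksAnalyzed = 50;
    public const int MaxFingerprintsAnalyzed = 500;

    // Returns one warning per statistic that exceeds its limit.
    public static List<string> Review(int totalTracksAnalyzed, int totalFingerprintsAnalyzed)
    {
        var warnings = new List<string>();
        if (totalTracksAnalyzed > MaxTracksAnalyzed)
        {
            warnings.Add($"TotalTracksAnalyzed = {totalTracksAnalyzed} exceeds {MaxTracksAnalyzed}");
        }
        if (totalFingerprintsAnalyzed > MaxFingerprintsAnalyzed)
        {
            warnings.Add($"TotalFingerprintsAnalyzed = {totalFingerprintsAnalyzed} exceeds {MaxFingerprintsAnalyzed}");
        }
        return warnings;
    }
}
```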

Read Different Types of Coverage to understand how query coverage is calculated.

Version 6.2.0

Version 6.2.0 provides the ability to query realtime datasources. This is useful for scenarios where you would like to monitor a realtime stream and get matching results as fast as possible.

Version 6.0.0

Version 6.0.0 provides a slightly improved IModelService interface. You can now insert a TrackInfo and its corresponding fingerprints in one method call. The signatures of the fingerprints stayed the same, so there is no need to re-index your tracks. Also, instead of inserting TrackData objects, a new lightweight data class has been added: TrackInfo.

Version 5.2.0

Version 5.2.0 provides a query configuration option, AllowMultipleMatchesOfTheSameTrackInQuery, which instructs the framework to consider the use case of having the same track matched multiple times within the same query. This is handy for long queries that can contain the same match scattered across the query. The default value is false.

Version 5.1.0

Starting from version 5.1.0 the fingerprint signature has changed to be more resilient to noise. You can try HighPrecisionFingerprintConfiguration in case your audio samples come from recordings that contain ambient noise. All users who migrate to 5.1.x have to re-index their data, since fingerprint signatures from versions <= 5.0.x are not compatible.

Version 5.0.0

Starting from version 5.0.0 the soundfingerprinting library supports .NET Standard 2.0. You can run the application not only in a Windows environment but on any other .NET Standard compliant runtime.

Algorithm configuration

Fingerprinting and Querying algorithms can be easily parametrized with corresponding configuration objects passed as parameters on command creation.

 var hashDatas = await FingerprintCommandBuilder.Instance
                           .BuildFingerprintCommand()
                           .From(samples)
                           .WithFingerprintConfig(new HighPrecisionFingerprintConfiguration())
                           .UsingServices(audioService)
                           .Hash();

Similarly, at query time you can specify a higher-precision query configuration if you are trying to detect audio in noisy environments.

QueryResult queryResult = await QueryCommandBuilder.Instance
                                   .BuildQueryCommand()
                                   .From(PathToFile)
                                   .WithQueryConfig(new HighPrecisionQueryConfiguration())
                                   .UsingServices(modelService, audioService)
                                   .Query();

There are 3 pre-built configurations to choose from: LowLatency, Default, HighPrecision. Nevertheless, you are not limited to these 3. You can amend each configuration property yourself via overloads.

In case you need directions for fine-tuning the algorithm for your particular use case, do not hesitate to contact me. Specifically, if you are trying to use it on mobile platforms, HighPrecisionFingerprintConfiguration may not be accurate enough.

Please use the matching fingerprinting configuration during query (i.e. HighPrecisionFingerprintConfiguration with HighPrecisionQueryConfiguration). Different configurations analyze different spectrum ranges, so they have to be used in pairs.

Substituting audio or model services

The most critical parts of the soundfingerprinting framework are interchangeable with extensions. If you want to use NAudio as the underlying audio processing library, just install the SoundFingerprinting.Audio.NAudio package and substitute IAudioService with NAudioService. The same holds for database storages: install the extension you want to use (e.g. SoundFingerprinting.Solr) and provide the new ModelService where needed.

Third party dependencies

Links to the third party libraries used by soundfingerprinting project.

FAQ

Can I apply this algorithm for speech recognition purposes?

No. The granularity of one fingerprint is roughly 1.46 seconds, which is too coarse to tell individual words apart.

Can the algorithm detect exact query position in resulted track?

Yes.

Can I use SoundFingerprinting to detect ads in radio streams?

Yes. Actually this is the most frequent use-case where SoundFingerprinting was successfully used.

Will SoundFingerprinting match tracks with samples captured in noisy environment?

Yes, try out HighPrecision configurations, or contact me for additional guidance.

Can I use SoundFingerprinting framework on Mono or .NET Core app?

Yes. SoundFingerprinting can be used in cross-platform applications. Keep in mind, though, that the cross-platform audio service SoundFingerprintingAudioService supports only .wav files as input.

How many tracks can I store in InMemoryModelService?

100 hours of content with the HighPrecision fingerprinting configuration will result in ~5GB of RAM usage.
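For rough capacity planning, the figure above can be extrapolated. The sketch below assumes that RAM usage grows linearly with the hours of content, which the README does not state, so treat the result as a ballpark estimate only. MemoryEstimate is a hypothetical helper, not a library class.

```csharp
using System;

Console.WriteLine($"Estimated RAM for 250 hours: ~{MemoryEstimate.GigabytesFor(250)} GB");

public static class MemoryEstimate
{
    // From the FAQ: ~5 GB for 100 hours with the HighPrecision configuration.
    private const double GigabytesPerHour = 5.0 / 100.0;

    // Assumption: memory scales linearly with content length.
    public static double GigabytesFor(double hoursOfContent) =>
        hoursOfContent * GigabytesPerHour;
}
```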

Binaries

git clone [email protected]:AddictedCS/soundfingerprinting.git

To build the latest version of the SoundFingerprinting assembly, run the following command from the repository root.

.\build.cmd

Get it on NuGet

Install-Package SoundFingerprinting

How it works

soundfingerprinting employs computer vision techniques to generate audio fingerprints. The fingerprints are generated from spectrogram images taken every N samples. Below is a 30-second-long non-overlapping spectrogram cut at the 318-2000Hz frequency range.

After a series of transformations these are converted into hashes, which are stored and used at query time. The fingerprints are robust to degradations to a certain degree. The DefaultFingerprintConfiguration class can be successfully used for radio stream monitoring. It handles different audio formats, aliased signals and sampling differences across tracks well. A more detailed article about how it works can be found on my blog.

Demo

My description of the algorithm, along with the demo project, can be found on CodeProject. The article is from 2011 and may be outdated. The demo project is an Audio File Duplicates Detector. Its latest source code can be found here. It's a WPF MVVM project that uses the algorithm to detect which files are perceptually very similar.

Contribute

If you want to contribute, you are welcome to open issues or join the discussion on the issues page. Feel free to contact me with any remarks, ideas, bug reports, etc.

License

The framework is provided under the MIT license.
