
Automatic Identification & Fingerprinting of Audio
The basic concept behind a fingerprinting system is to identify a piece of audio content by extracting a compact, unique signature from it (so-called content-based identification). In a training phase, such signatures are created from a set of known audio material and stored in a database.
Unknown content can then be identified by comparing its signature with those contained in the database.
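The two phases described above (training, then identification by signature comparison) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy signature used here (mean-removed log energy per frame), the squared-difference distance, and all function names (`signature`, `train`, `identify`, `tone`) are hypothetical and are not the AudioID algorithm or the MPEG-7 descriptor.

```python
import math
import random

def signature(samples, frame=256):
    """Toy signature: mean-removed log energy per frame.
    Assumption for illustration only, not the AudioID / MPEG-7 descriptor."""
    sig = []
    for start in range(0, len(samples) - frame + 1, frame):
        energy = sum(s * s for s in samples[start:start + frame]) / frame
        sig.append(math.log10(energy + 1e-12))
    mean = sum(sig) / len(sig)
    # Removing the mean makes the signature insensitive to overall gain.
    return [v - mean for v in sig]

def distance(a, b):
    """Mean squared difference over the overlapping part of two signatures."""
    n = min(len(a), len(b))
    return sum((x - y) ** 2 for x, y in zip(a, b)) / n

def train(known_items):
    """Training phase: compute and store one signature per known item."""
    return {name: signature(samples) for name, samples in known_items.items()}

def identify(database, samples):
    """Identification: return the name of the closest stored signature."""
    query = signature(samples)
    return min(database, key=lambda name: distance(database[name], query))

def tone(envelope, n=4096, freq=440.0, rate=8000.0):
    """Synthetic test signal: a sine tone shaped by an amplitude envelope."""
    return [envelope(i / n) * math.sin(2 * math.pi * freq * i / rate)
            for i in range(n)]

# Two known items with clearly different loudness contours.
known = {
    "rising": tone(lambda t: 0.1 + 0.9 * t),
    "falling": tone(lambda t: 1.0 - 0.9 * t),
}
database = train(known)

# "Unknown" content: the rising item, attenuated and with a little added noise.
rng = random.Random(0)
unknown = [0.5 * s + rng.uniform(-0.01, 0.01) for s in known["rising"]]
print(identify(database, unknown))
```

In this sketch the attenuated, noisy copy is still matched to the "rising" item because the signature captures the loudness contour rather than the exact sample values. A real system would use a far more discriminative feature and an indexed search rather than a linear scan.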
Performance of the AudioID System
In order to assess the system's recognition performance, the registered audio items are subjected to a wide range of signal manipulations which influence the audio signal's quality (e.g. equalization, acoustic transmission or MP3 encoding/decoding).
Like human recognition, which is surprisingly tolerant of even poor-sounding signal alterations, the system is designed to be robust against acoustic interference.
Depending on the type of signal distortion applied, recognition rates are typically better than 99%, and recognition runs (on standard PC hardware) several orders of magnitude faster than the audio playback time.
AudioID & MPEG-7 Audio
The AudioID system relies on a description core which has been standardized within the new MPEG-7 Audio Standard. This has the following benefits:
Identification relies on a published, open feature format rather than proprietary solutions.
MPEG-7 based signatures are likely to be produced as part of the standard metadata package which will accompany future advanced media formats.
Due to the exact and standardized specification of the descriptor, interoperability is guaranteed on a worldwide basis, i.e. every search engine relying on the MPEG-7 specification will be able to use compliant descriptions, wherever they may have been produced.
As a unique feature, AudioID MPEG-7 signatures are scalable, i.e. they allow a flexible trade-off between signature compactness and recognition robustness.
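The scalability mentioned in the last point can be pictured with a small Python sketch. It assumes, purely for illustration, that a signature is a series of feature values and that a more compact version is obtained by averaging groups of consecutive values; the function name `downscale` and the `ratio` parameter are hypothetical, and the actual scaling mechanism is defined by the MPEG-7 specification, not by this example.

```python
def downscale(series, ratio):
    """Average each group of `ratio` consecutive values.
    A trailing partial group is averaged over the values it contains.
    Illustrative assumption only, not the MPEG-7 scaling procedure."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    scaled = []
    for start in range(0, len(series), ratio):
        group = series[start:start + ratio]
        scaled.append(sum(group) / len(group))
    return scaled

full = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
half = downscale(full, 2)      # half the size, coarser detail
quarter = downscale(full, 4)   # a quarter of the size, coarser still
print(len(full), len(half), len(quarter))
```

A larger `ratio` gives a smaller signature but discards fine detail that may help distinguish similar items, which is the compactness-versus-robustness trade-off described above.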
