Itakura–Saito distance
The Itakura–Saito distance (or Itakura–Saito divergence) is a measure of the difference between an original spectrum and an approximation of that spectrum. Although it is not a perceptual measure, it is intended to reflect perceptual (dis)similarity. It was proposed by Fumitada Itakura and Shuzo Saito in the 1960s while they were with NTT.[1]
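For a power spectrum P(ω) and its approximation P̂(ω), the distance is defined as:[2]

$$D_{IS}\big(P(\omega), \hat{P}(\omega)\big) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ \frac{P(\omega)}{\hat{P}(\omega)} - \log \frac{P(\omega)}{\hat{P}(\omega)} - 1 \right] d\omega$$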
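As a rough illustration, the distance can be approximated numerically. The sketch below assumes the usual integrand, P/P̂ − log(P/P̂) − 1, and assumes both spectra are sampled on the same uniform frequency grid over one period, so that the normalized integral reduces to a mean over samples. The function name `itakura_saito` is hypothetical and not taken from any library.

```python
import math

def itakura_saito(p, p_hat):
    """Discrete approximation of the Itakura-Saito distance (illustrative sketch).

    p and p_hat are equal-length sequences of strictly positive power-spectrum
    samples on a uniform frequency grid covering one period. Under that
    assumption, (1 / 2*pi) * integral(...) becomes a plain average.
    """
    if len(p) != len(p_hat) or len(p) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for x, y in zip(p, p_hat):
        if x <= 0 or y <= 0:
            raise ValueError("spectrum samples must be strictly positive")
        ratio = x / y
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)

# Identical spectra are at distance zero.
print(itakura_saito([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
# The value depends only on the ratio P / P_hat, so rescaling both spectra
# by the same factor leaves it unchanged.
print(itakura_saito([2.0, 4.0], [1.0, 3.0]))
print(itakura_saito([20.0, 40.0], [10.0, 30.0]))
```

Because only the ratio of the two spectra enters the formula, low-energy and high-energy regions of the spectrum contribute on an equal footing.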
The Itakura–Saito distance is a Bregman divergence, but it is not a true metric: it is not symmetric[3] and it does not satisfy the triangle inequality.
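These properties can be checked on single positive numbers. The sketch below assumes the pointwise form d(x, y) = x/y − log(x/y) − 1 and uses F(t) = −log t as the generating function of the Bregman divergence; the function names are hypothetical and chosen only for illustration.

```python
import math

def is_point(x, y):
    """Pointwise Itakura-Saito divergence between two positive scalars."""
    return x / y - math.log(x / y) - 1.0

def bregman_neg_log(x, y):
    """Bregman divergence F(x) - F(y) - F'(y) * (x - y) with F(t) = -log(t)."""
    return -math.log(x) + math.log(y) + (x - y) / y

# The two expressions agree, which is the Bregman-divergence property.
print(math.isclose(is_point(3.0, 5.0), bregman_neg_log(3.0, 5.0)))

# Not symmetric: swapping the arguments changes the value.
forward = is_point(1.0, 2.0)    # about 0.193
backward = is_point(2.0, 1.0)   # about 0.307
print(forward, backward)

# Triangle inequality fails: going 1 -> 2 -> 4 is "shorter" than 1 -> 4.
direct = is_point(1.0, 4.0)                      # about 0.636
detour = is_point(1.0, 2.0) + is_point(2.0, 4.0)  # about 0.386
print(direct > detour)
```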
In non-negative matrix factorization, the Itakura–Saito divergence can be used as a measure of the quality of the factorization. This choice corresponds to a meaningful statistical model of the components, and the resulting optimization problem can be solved with an iterative method.[4]
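One possible iterative scheme is sketched below, assuming the multiplicative update rules commonly used for this divergence, where a non-negative matrix V is approximated by a product WH. This is a minimal plain-Python illustration, not necessarily the algorithm of the cited reference; `is_nmf` and the helper functions are hypothetical names, and the random initialization range and iteration count are arbitrary choices.

```python
import math
import random

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def zip_map(f, *mats):
    """Apply f element-wise across matrices of the same shape."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*mats)]

def is_divergence(V, V_hat):
    """Sum of element-wise Itakura-Saito divergences between V and V_hat."""
    terms = zip_map(lambda v, vh: v / vh - math.log(v / vh) - 1.0, V, V_hat)
    return sum(sum(row) for row in terms)

def is_nmf(V, rank, n_iter=200, seed=0):
    """Factor a strictly positive matrix V as W @ H (illustrative sketch)."""
    rng = random.Random(seed)
    W = [[rng.uniform(0.5, 1.5) for _ in range(rank)] for _ in V]
    H = [[rng.uniform(0.5, 1.5) for _ in V[0]] for _ in range(rank)]
    history = [is_divergence(V, matmul(W, H))]  # divergence before any update
    for _ in range(n_iter):
        # H <- H * (W^T (V / V_hat^2)) / (W^T (1 / V_hat))
        V_hat = matmul(W, H)
        ratio = zip_map(lambda v, vh: v / vh ** 2, V, V_hat)
        inv = zip_map(lambda vh: 1.0 / vh, V_hat)
        Wt = transpose(W)
        H = zip_map(lambda h, n, d: h * n / d, H, matmul(Wt, ratio), matmul(Wt, inv))
        # W <- W * ((V / V_hat^2) H^T) / ((1 / V_hat) H^T)
        V_hat = matmul(W, H)
        ratio = zip_map(lambda v, vh: v / vh ** 2, V, V_hat)
        inv = zip_map(lambda vh: 1.0 / vh, V_hat)
        Ht = transpose(H)
        W = zip_map(lambda w, n, d: w * n / d, W, matmul(ratio, Ht), matmul(inv, Ht))
        history.append(is_divergence(V, matmul(W, H)))
    return W, H, history

# Small made-up example: a 3x3 positive matrix approximated at rank 2.
V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.5], [0.5, 1.5, 1.0]]
W, H, history = is_nmf(V, rank=2)
print(history[0], history[-1])
```

The updates only multiply and divide positive quantities, so W and H stay positive without an explicit constraint. In practice the recorded divergence is expected to fall over the iterations, although this sketch does not check or enforce that at every step.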
References
- Itakura, F.; Saito, S. (1968). "Analysis synthesis telephony based on the maximum likelihood method". Proceedings of the 6th International Congress on Acoustics. Los Alamitos, CA: IEEE. pp. C–17–C–20.
- Alan H. S. Chan; Sio-Iong Ao (2008). Advances in industrial engineering and operations research. Springer. p. 51. ISBN 978-0-387-74903-7.
- A. Banerjee; et al. (2004). "Clustering with Bregman Divergences". In Michael W. Berry; Umeshwar Dayal; Chandrika Kamath; David Skillicorn. Proceedings of the Fourth SIAM International Conference on Data Mining. SIAM. pp. 234–245. ISBN 978-0-89871-568-2.
- Cédric Févotte; Nancy Bertin; Jean-Louis Durrieu (2009). "Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis". Neural Computation. 21 (3): 793–830. doi:10.1162/neco.2008.04-08-771. PMID 18785855.