Vocal Melody Extraction in the Presence of Pitched Accompaniment in Polyphonic Music
DSpace at IIT Bombay
Field | Value |
Title |
Vocal Melody Extraction in the Presence of Pitched Accompaniment in Polyphonic Music
|
|
Creator |
RAO, V
RAO, P |
|
Subject |
fundamental-frequency estimation
singing voice
source separation
monaural recordings
multipitch analysis
signals
speech
audio
algorithm
tracking
fundamental frequency estimation
music information retrieval (MIR)
music transcription
predominant pitch detection |
|
Description |
Melody extraction algorithms for single-channel polyphonic music typically rely on the salience of the lead melodic instrument, considered here to be the singing voice. However, the simultaneous presence of one or more pitched instruments in the polyphony can cause such a predominant-F0 tracker to switch between tracking the pitch of the voice and that of an instrument of comparable strength, resulting in reduced voice-pitch detection accuracy. We propose a system that, in addition to biasing the salience measure in favor of singing voice characteristics, acknowledges that the voice may not dominate the polyphony at all instants and therefore tracks an additional pitch to better deal with the potential presence of locally dominant pitched accompaniment. A feature based on the temporal instability of voice harmonics is used to finally identify the voice pitch. The proposed system is evaluated on test data that is representative of polyphonic music with strong pitched accompaniment. Results show that the proposed system is indeed able to recover melodic information lost to its single-pitch tracking counterpart, and also outperforms another state-of-the-art melody extraction system designed for polyphonic music.
|
|
Publisher |
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
|
|
Date |
2011-08-01T19:35:58Z
2011-12-26T12:53:28Z
2011-12-27T05:39:31Z
2010 |
|
Type |
Article
|
|
Identifier |
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 18(8), 2145-2154
1558-7916
http://dx.doi.org/10.1109/TASL.2010.2042124
http://dspace.library.iitb.ac.in/xmlui/handle/10054/8525
http://hdl.handle.net/10054/8525 |
|
Language |
en
|
|
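
The Description above refers to a salience measure for candidate pitches and to keeping a second pitch candidate alongside the predominant one. As a rough illustration only, the sketch below uses a generic harmonic-summation salience (a weighted sum of spectral magnitudes at the harmonics of each candidate F0) and returns the two most salient candidates in a single frame. This is not the salience function, tracker, or voice-identification feature of the article, whose details are not given in this record. All function names, the decay weight, the candidate grid, and the synthetic test mixture are assumptions made for this example.

```python
import cmath
import math


def magnitude_at(frame, freq_hz, sample_rate):
    """Normalized DFT magnitude of `frame` evaluated at one frequency."""
    acc = 0j
    for n, x in enumerate(frame):
        acc += x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate)
    return abs(acc) / len(frame)


def harmonic_salience(frame, f0_hz, sample_rate, n_harmonics=5, decay=0.8):
    """Weighted sum of magnitudes at the harmonics of a candidate F0.

    The geometric `decay` weighting is an illustrative assumption.
    """
    total = 0.0
    for h in range(1, n_harmonics + 1):
        freq = h * f0_hz
        if freq >= sample_rate / 2:  # ignore harmonics at or above Nyquist
            break
        total += decay ** (h - 1) * magnitude_at(frame, freq, sample_rate)
    return total


def two_most_salient(frame, sample_rate, candidates_hz):
    """Return the two candidate F0s with the highest salience, strongest first."""
    ranked = sorted(
        candidates_hz,
        key=lambda f0: harmonic_salience(frame, f0, sample_rate),
        reverse=True,
    )
    return ranked[:2]


def harmonic_tone(f0_hz, gain, sample_rate, n_samples, n_partials=4):
    """Synthetic pitched tone whose partial h has amplitude gain / h."""
    return [
        sum(
            gain / h * math.sin(2 * math.pi * h * f0_hz * n / sample_rate)
            for h in range(1, n_partials + 1)
        )
        for n in range(n_samples)
    ]


# Hypothetical 100 ms frame: a "voice" at 220 Hz plus a slightly weaker
# pitched "accompaniment" at 330 Hz.
sample_rate = 8000
n_samples = 800
voice = harmonic_tone(220, 1.0, sample_rate, n_samples)
accompaniment = harmonic_tone(330, 0.9, sample_rate, n_samples)
mixture = [v + a for v, a in zip(voice, accompaniment)]

candidates = [200, 220, 250, 260, 290, 330, 350]
print(two_most_salient(mixture, sample_rate, candidates))
```

Keeping both candidates per frame means a later stage can decide which one belongs to the voice, rather than committing to whichever source is locally stronger. A practical system would also need a finer candidate grid, frame-to-frame tracking, and handling of octave-related candidates, none of which this sketch attempts.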