Several features were compared with regard to recognition performance
in a musical instrument recognition system. Mel-frequency and linear
prediction cepstral coefficients, together with the corresponding delta
cepstral coefficients, were calculated. Linear prediction analysis was
carried out both on a
uniform and a warped frequency scale, and reflection coefficients were
also used as features. The performance of previously described features
relating to the temporal development, modulation properties,
brightness, and spectral synchrony of sounds was also analyzed. The
database consisted of 5288 acoustic and synthetic solo tones from 29
different Western orchestral instruments, of which 16 were included
in the test set. The best performance for solo tone
recognition, 35 % for individual instruments and 65 % for families,
was obtained with a feature set consisting of two sets of mel-
frequency cepstral coefficients and a subset of the other analyzed
features. The confusions made by the system were analyzed and
compared to results reported in a human perception experiment.
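
As an illustration of the reflection-coefficient features mentioned above, the following minimal sketch derives linear prediction and reflection coefficients from a frame's autocorrelation using the Levinson-Durbin recursion. The abstract does not state which estimation method, model order, or sign convention the system used, so the autocorrelation method, the function names `autocorrelation` and `levinson_durbin`, and the convention A(z) = 1 + a[1] z^-1 + ... are assumptions made for this example only. Frequency warping is not covered.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of one signal frame x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Levinson-Durbin recursion (illustrative, not the paper's code).

    r     : autocorrelation values r[0..order], with r[0] > 0
    order : prediction order (assumed; the abstract gives no value)

    Returns (a, k, err):
      a   : prediction-error filter, a[0] = 1, A(z) = sum a[j] z^-j
      k   : reflection coefficients k[0..order-1]
      err : final prediction error energy
    """
    a = [1.0] + [0.0] * order
    k = []
    err = float(r[0])
    for m in range(1, order + 1):
        # Correlation between the current residual and the lag-m sample.
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        km = -acc / err
        k.append(km)
        # Update the filter coefficients for order m.
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + km * a[m - j]
        new_a[m] = km
        a = new_a
        # The prediction error shrinks by (1 - km^2) at each order.
        err *= 1.0 - km * km
    return a, k, err


if __name__ == "__main__":
    # Toy autocorrelation of a first-order autoregressive process.
    r = [1.0, 0.5, 0.25]
    a, k, err = levinson_durbin(r, order=2)
    print("filter:", a)
    print("reflection coefficients:", k)
    print("prediction error:", err)
```

For a valid autocorrelation sequence, every reflection coefficient has magnitude at most one, which is one reason such coefficients are often convenient as bounded features.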