We present a prototype method for indexing raw-audio music files in a way that facilitates content-based similarity retrieval. The algorithm tries to capture the intuitive notion of similarity perceived by humans: two pieces are similar if they are fully or partially based on the same score, even if they are performed by different people or at different speeds.
Local peaks in signal power are identified in each audio file, and a spectral vector is extracted near each peak. Nearby peaks are selectively grouped together to form ``characteristic sequences'', which are the basis for indexing. A hashing scheme known as ``Locality-Sensitive Hashing'' is employed to index the high-dimensional vectors. Retrieval results are ranked by the number of final matches that remain after filtering by linearity criteria.
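
To make the hashing step concrete, the following is a minimal sketch of a locality-sensitive hash index over fixed-length vectors. It is an illustration under stated assumptions, not the scheme used in this work: the random-hyperplane hash family, the class name `HyperplaneLSH`, and the parameters `n_bits`, `n_tables`, and `seed` are all hypothetical choices made for the example. The idea it shows is only that vectors pointing in similar directions tend to fall into the same bucket in at least one table, so candidate matches can be found without scanning every stored vector.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Toy LSH index using random hyperplanes (an assumed hash family)."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # One independent set of n_bits random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    @staticmethod
    def _key(planes, vec):
        # Each bit records which side of one hyperplane the vector lies on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, label, vec):
        # Store the label in the matching bucket of every table.
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].append(label)

    def query(self, vec):
        # Candidates are labels sharing a bucket with vec in any table.
        found = set()
        for planes, table in zip(self.planes, self.tables):
            found.update(table.get(self._key(planes, vec), []))
        return found


if __name__ == "__main__":
    index = HyperplaneLSH(dim=4)
    index.add("piece_a", [1.0, 0.5, -0.2, 0.3])
    # A rescaled copy of the stored vector lands in the same buckets.
    print(index.query([2.0, 1.0, -0.4, 0.6]))
```

In this sketch the hash depends only on the signs of the projections, so a vector and any positively scaled copy of it always collide. Candidates returned by `query` would still need a verification pass, such as the match filtering described above, before being ranked.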