A music retrieval system takes an input melody as the query. In one embodiment, changes in the distribution of energy across the frequency spectrum over time are used to find breakpoints in the input melody, which separate it into distinct notes. In another embodiment, the breakpoints are identified from changes in pitch over time. A confidence level is preferably associated with each breakpoint and/or note extracted from the input melody. The confidence level is based on one or more of: changes in pitch, absolute values of a spectral energy distribution indicator, relative values of the spectral energy distribution indicator, and the energy level of the input melody. The input melody is matched against songs in the music database by minimizing a cost computation that accounts for note insertion and deletion errors and penalizes these errors in accordance with the confidence levels of the breakpoints and/or notes.
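The confidence-weighted matching described above can be illustrated with a minimal sketch. It is not the claimed method. It assumes a standard edit-distance dynamic program in which dropping an extracted note costs that note's confidence, so a doubtful note is cheap to discard. The names `match_cost` and `rank_songs`, the fixed `gap` penalty, the unit mismatch cost, and the use of absolute semitone pitches are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (pitch in semitones, confidence in [0, 1]).
Note = Tuple[int, float]


def match_cost(query: List[Note], song: List[int], gap: float = 1.0) -> float:
    """Minimum alignment cost between extracted query notes and a song.

    Assumed cost model (illustrative only):
      - dropping a query note (a spurious insertion) costs its confidence;
      - skipping a song note (a note missing from the query) costs `gap`;
      - aligning two notes costs 0 if the pitches agree, else 1.
    """
    n, m = len(query), len(song)
    # cost[i][j] is the cheapest alignment of query[:i] with song[:j].
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + query[i - 1][1]
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap
    for i in range(1, n + 1):
        pitch, conf = query[i - 1]
        for j in range(1, m + 1):
            mismatch = 0.0 if pitch == song[j - 1] else 1.0
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # align the two notes
                cost[i - 1][j] + conf,          # drop the query note
                cost[i][j - 1] + gap,           # skip the song note
            )
    return cost[n][m]


def rank_songs(query: List[Note], database: Dict[str, List[int]]) -> List[str]:
    """Return song titles ordered from lowest to highest matching cost."""
    return sorted(database, key=lambda title: match_cost(query, database[title]))
```

For example, if the middle note of a three-note query was extracted with low confidence, a two-note song that lacks it still matches cheaply, whereas the same extra note extracted with high confidence is penalized more heavily.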