A voice recognition method includes: detecting a vocal section including a vocal sound in a voice, based on a feature value of an audio signal representing the voice; identifying a word expressed by the vocal sound in the vocal section, by matching the feature value of the audio signal of the vocal section and an acoustic model of each of a plurality of words; and selecting, with a processor, the word expressed by the vocal sound in a word section based on a comparison result between a signal characteristic of the word section and a signal characteristic of the vocal section.