Patent attributes
Embodiments describe a method for speech endpoint detection including receiving identification data for a first state associated with a first frame of speech data from a WFST language model, determining that the first frame of the speech data includes silence data, incrementing a silence counter associated with the first state, copying a value of the silence counter of the first state to a corresponding silence counter field in a second state associated with the first state in an active state list, and determining that the value of the silence counter for the first state is above a silence threshold. The method further includes, determining that an endpoint of the speech has occurred in response to determining that the silence counter is above the silence threshold, and outputting text data representing a plurality of words determined from the speech data that was received prior to the endpoint.