Patent attributes
The disclosed computer-implemented method may include receiving input voice data synchronous with a visual state of a user interface of a third-party application, generating multiple sentence alternatives for the received input voice data, identifying a best sentence of the multiple sentence alternatives, executing a dialog script for the third-party application using the best sentence, the dialog script generating a response to the received input voice data, the response comprising output voice data and a corresponding visual response, and providing the visual response and the output voice data to the third-party application, the third-party application playing the output voice data synchronously with updating the user interface based on the visual response. Various other methods, systems, and computer-readable media are also disclosed.
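
The steps recited above can be illustrated with a minimal sketch. It is not the disclosed implementation: every name in it (`Hypothesis`, `Response`, `generate_alternatives`, `identify_best`, `dialog_script`, `handle_voice_input`) is hypothetical, the recognizer is a stub that returns a fixed list of alternatives, and selecting the highest-scoring alternative is only one assumed way of identifying a best sentence.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hypothesis:
    text: str
    score: float  # assumed recognizer confidence; higher is better


@dataclass
class Response:
    output_voice: bytes              # audio the application would play
    visual_response: Dict[str, str]  # UI update the application would apply


def generate_alternatives(voice: bytes) -> List[Hypothesis]:
    """Stub recognizer (hypothetical): returns a fixed list of alternatives."""
    return [
        Hypothesis("show me read shoes", 0.41),
        Hypothesis("show me red shoes", 0.87),
        Hypothesis("show me red shows", 0.35),
    ]


def identify_best(alternatives: List[Hypothesis]) -> Hypothesis:
    """Assumed selection rule: take the top-scoring alternative."""
    return max(alternatives, key=lambda h: h.score)


def dialog_script(sentence: str, visual_state: Dict[str, str]) -> Response:
    """Toy dialog script: builds a spoken reply and a matching UI update."""
    reply = f"Here are results for: {sentence}"
    return Response(
        output_voice=reply.encode("utf-8"),  # placeholder for synthesized audio
        visual_response={
            "screen": "results",
            "query": sentence,
            "previous_screen": visual_state.get("screen", ""),
        },
    )


def handle_voice_input(
    voice: bytes,
    visual_state: Dict[str, str],
    script: Callable[[str, Dict[str, str]], Response] = dialog_script,
) -> Response:
    """Run the pipeline: alternatives -> best sentence -> dialog script."""
    best = identify_best(generate_alternatives(voice))
    return script(best.text, visual_state)


# The application would play output_voice while applying visual_response.
response = handle_voice_input(b"\x00\x01", {"screen": "home"})
```

In this sketch the voice data and the visual state are passed in together so that the dialog script can tailor its response to what the user was looking at when speaking; the returned `Response` pairs the audio with the visual update so the application can present both in step.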