US Patent 10573295 End-to-end speech recognition with policy learning

The disclosed technology teaches a deep end-to-end speech recognition model, including using multi-objective learning criteria to train a deep end-to-end speech recognition model on training data comprising speech samples temporally labeled with ground truth transcriptions. The multi-objective learning criteria updates model parameters of the model over one thousand to millions of backpropagation iterations by combining, at each iteration, a maximum likelihood objective function that modifies the model parameters to maximize a probability of outputting a correct transcription and a policy gradient function that modifies the model parameters to maximize a positive reward defined based on a non-differentiable performance metric which penalizes incorrect transcriptions in accordance with their conformity to corresponding ground truth transcriptions; and upon convergence after a final backpropagation iteration, persisting the modified model parameters learned by using the multi-objective learning criteria with the model to be applied to further end-to-end speech recognition.

Timeline

No Timeline data yet.

Further Resources

Title

Author

Link

Type

Date

No Further Resources data yet.

US Patent 10573295 End-to-end speech recognition with policy learning

Contents

Patent attributes

Timeline

Further Resources

References

Find more entities like US Patent 10573295 End-to-end speech recognition with policy learning