Patent attributes
Described herein are systems, methods, and techniques by which a processing unit can build an end-to-end dialogue agent model for end-to-end learning of dialogue agents for information access, and can apply the end-to-end dialogue agent model with soft attention over knowledge base entries to make the dialogue system differentiable. In various examples, the processing unit can apply the end-to-end dialogue agent model to a source of input; fill slots for output from the knowledge base entries; induce a posterior distribution over the entities in a knowledge base, or induce a posterior distribution of a target of the requesting user over entities from a knowledge base; develop an end-to-end differentiable model of a dialogue agent; use supervised and/or imitation learning to initialize network parameters; and/or calculate a modified version of an episodic algorithm, e.g., the REINFORCE algorithm, for training an end-to-end differentiable model based on user feedback.
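By way of illustration only, inducing a posterior distribution over knowledge base entities from per-slot beliefs might be sketched as follows. The sketch assumes that slots are independent, that every entity has a value for every slot, and that the belief over each slot's values has already been produced by some upstream component. The names `entity_posterior`, `KB`, and `BELIEFS` and the toy data are hypothetical and do not appear in the description above; the normalized product used here is one simple form of soft lookup, not necessarily the formulation contemplated herein.

```python
from math import prod


def entity_posterior(kb, slot_beliefs):
    """Return a probability for each KB row given per-slot value beliefs.

    kb: list of dicts mapping slot name -> value (hypothetical layout).
    slot_beliefs: dict mapping slot name -> {value: probability}.
    """
    scores = []
    for row in kb:
        # Assumption: slots are independent, so the unnormalized score of a
        # row is the product of the beliefs assigned to its slot values.
        score = prod(
            slot_beliefs[slot].get(row[slot], 0.0) for slot in slot_beliefs
        )
        scores.append(score)

    total = sum(scores)
    if total == 0.0:
        # No row is supported by the beliefs; fall back to a uniform posterior.
        return [1.0 / len(kb)] * len(kb)
    # Normalizing keeps every row in play with a graded weight, in contrast
    # to a hard lookup that would select matching rows and discard the rest.
    return [score / total for score in scores]


# Toy knowledge base and beliefs, for illustration only.
KB = [
    {"actor": "A", "year": "1999"},
    {"actor": "A", "year": "2005"},
    {"actor": "B", "year": "2005"},
]
BELIEFS = {
    "actor": {"A": 0.8, "B": 0.2},
    "year": {"1999": 0.3, "2005": 0.7},
}
POSTERIOR = entity_posterior(KB, BELIEFS)
```

Because each entity's weight is a smooth function of the slot beliefs rather than the result of a discrete query, a gradient could in principle flow from the posterior back to whatever component produces those beliefs, which is the sense in which soft attention over knowledge base entries keeps the system differentiable.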