Patent attributes
Provided are a system, method, and computer-readable medium for generating data that may be used to train models for a natural language processing application. A system architect creates a plurality of sentence patterns that include entity variables and initiates sentence generation. Each entity is associated with one or more entity data sources. A language generator accepts the sentence patterns as inputs and references the associated entity data sources to create a plurality of generated sentences. The generated sentences may be associated with a particular class and therefore used to train one or more statistical classification models and entity extraction models for associated models. The sentence generation process may be initiated and controlled using a user interface displayable on a computing device, the user interface being in communication with the language generator.
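By way of illustration only, the pattern expansion described above might be sketched as follows. This is a minimal sketch and not the disclosed implementation: the `{name}` placeholder syntax, the `ENTITY_SOURCES` dictionary, the `generate` function, and the `transfer_money` class label are all hypothetical names chosen for the example, and in-memory lists stand in for the entity data sources.

```python
import itertools
import re
from typing import Dict, Iterator, List, Tuple

# Hypothetical entity data sources: entity name -> candidate values.
# A real system might read these from databases or files instead.
ENTITY_SOURCES: Dict[str, List[str]] = {
    "amount": ["$20", "$500"],
    "payee": ["Alice", "the electric company"],
}

# Assumed placeholder syntax for entity variables: {entity_name}
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def generate(
    pattern: str, label: str, sources: Dict[str, List[str]]
) -> Iterator[Tuple[str, str, Dict[str, str]]]:
    """Yield (sentence, class label, entity values) for every combination
    of entity values that can fill the pattern's entity variables."""
    # Entity variables in order of first appearance, without duplicates.
    names = list(dict.fromkeys(PLACEHOLDER.findall(pattern)))
    for combo in itertools.product(*(sources[name] for name in names)):
        values = dict(zip(names, combo))
        sentence = PLACEHOLDER.sub(lambda m: values[m.group(1)], pattern)
        yield sentence, label, values


if __name__ == "__main__":
    for sentence, label, entities in generate(
        "Send {amount} to {payee}", "transfer_money", ENTITY_SOURCES
    ):
        print(label, "|", sentence, "|", entities)
```

Each generated tuple carries both the class label, which could serve as a training target for a classification model, and the entity values that were substituted, which could serve as annotations for an entity extraction model.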