Patent attributes
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for automatic text report concept generation. Generating concepts from text reports includes receiving a collection of text reports; performing a clustering process for a plurality of different cluster sizes; evaluating each of the plurality of different cluster sizes to select an optimal cluster size; generating, from the collection of text reports, clusters using the selected optimal cluster size; aggregating text associated with text reports in each cluster; maintaining a training dataset comprising the aggregated text; and generating a predictive model from the training dataset to generate a concept for an input text report.
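The steps above can be illustrated with a minimal sketch. The abstract does not name a text representation, a clustering algorithm, an evaluation criterion, or a model type, so everything concrete here is an assumption made for illustration: bag-of-words counts, k-means with cosine distance and a deterministic farthest-point initialization, the mean silhouette score as the measure of an "optimal" size, and a nearest-aggregate lookup standing in for the predictive model. "Cluster size" is read here as the number of clusters, which is also an assumption. All function names (`vectorize`, `kmeans`, `silhouette`, `build_concept_model`) are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed representation, not from the source)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)


def kmeans(vectors, k, iters=20):
    """k-means under cosine distance; centroids are summed member counts."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(vectors, key=lambda v: min(cosine_distance(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: cosine_distance(v, centroids[c])) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[c] = total
    return labels


def silhouette(vectors, labels):
    """Mean silhouette coefficient; higher means tighter, better-separated clusters."""
    scores = []
    for i, v in enumerate(vectors):
        by_cluster = {}
        for j, w in enumerate(vectors):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(cosine_distance(v, w))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster, or only one cluster overall
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b) if max(a, b) else 0.0)
    return sum(scores) / len(scores)


def build_concept_model(reports, candidate_sizes, top_terms=3):
    vectors = [vectorize(r) for r in reports]
    # Cluster once per candidate size and keep the best-scoring size.
    best_k = max(candidate_sizes, key=lambda k: silhouette(vectors, kmeans(vectors, k)))
    labels = kmeans(vectors, best_k)
    # Aggregate the text of the reports in each cluster.
    aggregated = {}
    for report, label in zip(reports, labels):
        aggregated.setdefault(label, []).append(report)
    # Training dataset: (aggregated text, concept), with the concept taken as
    # the cluster's most frequent terms (an assumed labeling rule).
    training = []
    for texts in aggregated.values():
        joined = " ".join(texts)
        concept = " ".join(term for term, _ in vectorize(joined).most_common(top_terms))
        training.append((joined, concept))

    def predict(text):
        # Stand-in model: return the concept of the closest aggregated text.
        v = vectorize(text)
        return min(training, key=lambda row: cosine_distance(v, vectorize(row[0])))[1]

    return best_k, training, predict


reports = [
    "engine oil leak detected",
    "oil leak near engine",
    "engine leak oil pressure",
    "cabin light flickering",
    "light in cabin flickering",
    "cabin light bulb flickering",
]
best_k, training, predict = build_concept_model(reports, candidate_sizes=[2, 3])
concept = predict("oil leak under the engine cover")
```

On this toy collection the two topics share no vocabulary, so the silhouette criterion favors two clusters over three. A production system would likely substitute a richer text representation and a trained classifier or generative model for the lookup used here.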