Patent attributes
There is a need for more effective and efficient text categorization. This need can be addressed by, for example, techniques for semantic text categorization. In one example, a method includes determining an input vector-based representation of an input document; processing the input vector-based representation using a trained supervised machine learning model to generate the categorization based at least in part on the input vector-based representation, wherein: (i) the trained supervised machine learning model has been trained using automatically-generated training data, and (ii) the automatically generated training data is generated by determining an inferred semantic label for each unlabeled training document of one or more unlabeled training documents; and performing one or more categorization-based actions based at least in part on the categorization, and (iii) the labels are described by one or more short documents/short texts.