Patent attributes
This disclosure describes techniques that include identifying sensitive information from any appropriate set of data, such as data produced by operations of a business or organization. In one example, this disclosure describes a method that includes receiving text data containing sensitive information, including structured sensitive information and unstructured sensitive information; applying a rule-based model to identify the structured sensitive information in the text data; applying a machine learning model to identify the unstructured sensitive information in the text data, wherein the machine learning model has been trained to identify unstructured sensitive information in text; and generating output text data from the text data by modifying the structured sensitive information identified by the rule-based model and the unstructured sensitive information identified by the machine learning model.
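The method described above can be illustrated with a minimal sketch, which is not the disclosed implementation. It assumes that structured sensitive information means fixed-format items such as Social Security numbers and email addresses, and that "modifying" means replacing each identified span with a bracketed label. The names `Span`, `RULES`, `rule_based_spans`, `placeholder_ml_spans`, and `redact` are hypothetical. In particular, `placeholder_ml_spans` is a simple lexicon lookup standing in for a trained machine learning model, which the disclosure requires but which cannot be reproduced in a short example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # kind of sensitive information found


# Rule-based model (assumed patterns): structured items follow fixed formats.
RULES = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def rule_based_spans(text: str) -> List[Span]:
    """Identify structured sensitive information with regular expressions."""
    return [
        Span(m.start(), m.end(), label)
        for label, pattern in RULES.items()
        for m in pattern.finditer(text)
    ]


def placeholder_ml_spans(text: str) -> List[Span]:
    """Stand-in for a trained model that finds unstructured information.

    ASSUMPTION: a real system would call a trained sequence labeler here.
    This lexicon lookup only imitates the interface of such a model.
    """
    known_names = ("Jane Doe",)
    return [
        Span(m.start(), m.end(), "NAME")
        for name in known_names
        for m in re.finditer(re.escape(name), text)
    ]


def redact(text: str, detectors: Iterable[Callable[[str], List[Span]]]) -> str:
    """Generate output text by replacing every identified span with its label."""
    spans = sorted(
        (s for detect in detectors for s in detect(text)),
        key=lambda s: (s.start, -s.end),
    )
    pieces: List[str] = []
    cursor = 0
    for s in spans:
        if s.start < cursor:
            # Overlaps a span already replaced: drop the overlap, keep it hidden.
            cursor = max(cursor, s.end)
            continue
        pieces.append(text[cursor:s.start])
        pieces.append(f"[{s.label}]")
        cursor = s.end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    sample = "Jane Doe (SSN 123-45-6789) wrote from jane@example.com."
    print(redact(sample, [rule_based_spans, placeholder_ml_spans]))
    # prints: [NAME] (SSN [SSN]) wrote from [EMAIL].
```

Both detectors return spans in the same form, so the output step does not need to know which model found a given item. Label replacement is only one possible modification; masking or substituting synthetic values would fit the same structure.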