Patent attributes
In an approach for anonymizing data, a processor receives a mixed-type dataset with at least two relational attributes and at least one textual attribute. A processor runs the mixed-type dataset through a text annotator to discover a set of personally identifiable information (PII). A processor creates a set of ghost attributes to add to the mixed-type dataset. A processor anonymizes data of the at least two relational attributes and the set of ghost attributes. A processor replaces each item of PII in the textual attribute with the corresponding anonymized data from the at least two relational attributes or the set of ghost attributes to create an anonymized mixed-type dataset. A processor removes the set of ghost attributes from the anonymized mixed-type dataset. A processor shuffles records of the anonymized mixed-type dataset to create a shuffled anonymized mixed-type dataset. A processor outputs the shuffled anonymized mixed-type dataset.
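The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything named in it is an assumption made for illustration: the regex-based annotator (PII_PATTERNS, annotate), the per-value generalization rules (generalize), the "ghost_" column prefix, the function anonymize_mixed, and the sample attribute names age, zip, and notes. The abstract does not say how the relational and ghost attributes are anonymized, so simple value generalization stands in for that step here. The sketch also assumes that a PII type found in the text corresponds to the relational attribute of the same name in the same record.

```python
import random
import re

# Hypothetical text annotator: one regex per PII type (assumption, not from the patent).
PII_PATTERNS = {
    "age": re.compile(r"\b\d{1,3}(?= years old)"),
    "zip": re.compile(r"\b\d{5}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def annotate(text):
    """Return {pii_type: first matched string} for each PII type found in text."""
    found = {}
    for name, pattern in PII_PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(0)
    return found


def generalize(name, value):
    """Stand-in anonymizer: generalize or mask a single value (assumed rules)."""
    value = str(value)
    if name == "age":
        low = int(value) // 10 * 10
        return f"{low}-{low + 9}"      # e.g. 42 becomes "40-49"
    if name == "zip":
        return value[:3] + "**"        # keep only the 3-digit prefix
    return "[REDACTED]"                # anything else is fully masked


def anonymize_mixed(records, relational, text_attr, seed=None):
    rows = [dict(r) for r in records]  # work on copies, leave the input untouched

    # 1. Run the textual attribute through the annotator to discover PII.
    annotations = [annotate(row[text_attr]) for row in rows]

    # 2. Create ghost attributes for PII types that have no relational column.
    ghost_names = sorted({n for a in annotations for n in a} - set(relational))
    for row, found in zip(rows, annotations):
        for name in ghost_names:
            row["ghost_" + name] = found.get(name)

    # 3. Anonymize the relational attributes and the ghost attributes.
    for row in rows:
        for name in relational:
            row[name] = generalize(name, row[name])
        for name in ghost_names:
            key = "ghost_" + name
            if row[key] is not None:
                row[key] = generalize(name, row[key])

    # 4. Replace each PII in the text with the corresponding anonymized value.
    for row, found in zip(rows, annotations):
        text = row[text_attr]
        for name in found:
            key = name if name in relational else "ghost_" + name
            replacement = row[key]
            text = PII_PATTERNS[name].sub(lambda m, rep=replacement: rep, text)
        row[text_attr] = text

    # 5. Remove the ghost attributes again.
    for row in rows:
        for name in ghost_names:
            del row["ghost_" + name]

    # 6. Shuffle the records so output order does not reveal input order.
    random.Random(seed).shuffle(rows)
    return rows


records = [
    {"age": 42, "zip": "10027",
     "notes": "Patient is 42 years old, lives in 10027, call 555-123-4567."},
    {"age": 37, "zip": "94110",
     "notes": "Aged 37 years old; no phone on file."},
]
result = anonymize_mixed(records, ["age", "zip"], "notes", seed=0)
```

In this sketch the phone number appears only in the text, so it is the one PII type that receives a ghost attribute. The ghost column exists only long enough to be anonymized alongside the relational columns and to supply the replacement value for the text, and it is dropped before the records are shuffled and returned.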