Patent attributes
Certain aspects of the present disclosure provide techniques for detecting and protecting personally identifiable information. In one example, a method includes retrieving a user-specific dataset from a multi-user dataset; filtering the user-specific dataset to create a user-specific data subset; determining a user-specific frequency of each user-specific token of a plurality of user-specific tokens in the user-specific data subset; determining a multi-user frequency of each user-specific token of the plurality of user-specific tokens in the multi-user dataset; computing a frequency ratio based on the user-specific frequency and the multi-user frequency of each user-specific token of the plurality of user-specific tokens; and protecting each user-specific token whose frequency ratio is above a frequency ratio threshold.
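
The steps of the example method can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the disclosure: the multi-user dataset is modeled as a dictionary mapping a user ID to a list of text records, filtering is modeled as stopword removal, frequencies are raw token counts, the frequency ratio is the user-specific count divided by the multi-user count, protection is modeled as masking, and the names `find_protected_tokens`, `protect`, `STOPWORDS`, and the threshold of 0.8 are hypothetical.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"[A-Za-z0-9']+")
# Hypothetical filter list; the disclosure does not specify the filter.
STOPWORDS = {"the", "a", "to", "for", "and", "of", "my", "is"}


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def filter_tokens(tokens):
    """Assumed filtering step: drop common stopwords."""
    return [t for t in tokens if t not in STOPWORDS]


def find_protected_tokens(multi_user_dataset, user_id, ratio_threshold=0.8):
    """Return the user's tokens whose frequency ratio exceeds the threshold."""
    # Retrieve the user-specific dataset from the multi-user dataset.
    user_records = multi_user_dataset[user_id]

    # Filter it to create the user-specific data subset.
    user_subset = filter_tokens(
        [tok for record in user_records for tok in tokenize(record)]
    )

    # User-specific frequency of each token in the subset.
    user_freq = Counter(user_subset)

    # Multi-user frequency of each token across all users' records.
    multi_freq = Counter(
        tok
        for records in multi_user_dataset.values()
        for record in records
        for tok in tokenize(record)
    )

    # Assumed ratio: user-specific count / multi-user count. The user's records
    # are part of the multi-user dataset, so the denominator is never zero.
    return {
        tok
        for tok, count in user_freq.items()
        if count / multi_freq[tok] > ratio_threshold
    }


def protect(text, protected, mask="[REDACTED]"):
    """Assumed protection step: replace protected tokens with a mask."""
    return TOKEN_RE.sub(
        lambda m: mask if m.group(0).lower() in protected else m.group(0),
        text,
    )
```

Under these assumptions, a token that appears almost exclusively in one user's records (such as a surname) has a ratio near 1 and is masked, while a token shared across many users (such as "invoice") has a low ratio and is left in place.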