Patent attributes
A system can include a processor in communication with a data store. The processor can obtain personally identifiable information (PII) data and segregate the PII data into two or more secondary representations. The processor can generate a plurality of co-occurrence matrices based on the two or more secondary representations. The processor can perform a convolution between each of the plurality of co-occurrence matrices and one of a plurality of Gaussian kernels, wherein each of the plurality of Gaussian kernels comprises a different width. The processor can generate a tertiary representation of the PII data by performing a linear combination of the plurality of co-occurrence matrices. The processor can generate a vector based on the tertiary representation and perform a lossy tokenization process on the vector to generate a token. The processor can store the token at the data store.
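The pipeline described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify how the PII is segregated, what the co-occurrence matrices count, which kernel widths or combination weights are used, or how the lossy tokenization works. The sketch assumes, purely for illustration, that each record field is one secondary representation, that each matrix counts adjacent character pairs, that the kernel widths are 0.5, 1.0, 1.5 and so on, that the weights are equal, and that the lossy step is a seeded random projection reduced to sign bits. All function names (`segregate`, `cooccurrence`, `gaussian_kernel`, `smooth`, `tokenize`) are hypothetical.

```python
import numpy as np

# Hypothetical alphabet; characters outside it are ignored.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def segregate(record):
    """Split a PII record into secondary representations (one string per field)."""
    return [str(value).lower() for value in record.values()]


def cooccurrence(text):
    """Count how often each character is immediately followed by another."""
    size = len(ALPHABET)
    matrix = np.zeros((size, size))
    ids = [INDEX[ch] for ch in text.lower() if ch in INDEX]
    for a, b in zip(ids, ids[1:]):
        matrix[a, b] += 1.0
    return matrix


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def smooth(matrix, sigma):
    """Convolve a matrix with a separable 2-D Gaussian (rows, then columns)."""
    kernel = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, matrix, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")


def tokenize(record, bits=64, seed=0):
    """Map a record to a fixed-length hex token; the mapping is not invertible."""
    parts = segregate(record)  # expects at least one field
    # One kernel width per matrix: 0.5, 1.0, 1.5, ... (assumed values).
    smoothed = [smooth(cooccurrence(p), 0.5 * (i + 1)) for i, p in enumerate(parts)]
    # Tertiary representation: equal-weight linear combination (assumed weights).
    tertiary = sum(smoothed) / len(smoothed)
    vector = tertiary.ravel()
    # Lossy step: seeded random projection, keeping only the sign of each output.
    projection = np.random.default_rng(seed).standard_normal((bits, vector.size))
    signs = (projection @ vector > 0).astype(np.uint8)
    return np.packbits(signs).tobytes().hex()


if __name__ == "__main__":
    # Made-up record; the resulting token is what would be written to the data store.
    print(tokenize({"name": "Jane Doe", "dob": "1990-01-31"}))
```

In this sketch the token is lossy in two ways: the co-occurrence counts discard character order beyond adjacent pairs, and the sign-bit projection collapses a 1,369-dimensional vector into 64 bits, so the original PII cannot be reconstructed from the stored token.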