Patent attributes
Embodiments described herein enable data associated with a large plurality of users to be analyzed without compromising the privacy of the user data. In one embodiment, a user can opt-in to allow analysis of clear text of the user's emails. An analysis process can then be performed in which an analysis service receives clear text of an email of a client device; processes the clear text of the email into one or more tokens having one or more tags; enriches one or more tokens in the processed email using data associated with a user of the client device and the one or more tags; and processes the clear text and one or more enriched tokens to generate a data set of one or more feature vectors.