Patent attributes
An online system receives advertisement requests from one or more advertisers and determines whether an advertisement request includes malicious content before presenting content from the advertisement request to a user. To make this determination, the online system identifies text in the advertisement request, the words in that text, and the characters in each word. For each word, the online system identifies the most common type of character and generates a score based on the word's constituent characters. For example, a word's score may be based on the combination of character types in the word, such as the conditional probability that a word includes a particular type of character given that it includes a given number of characters of its most common type. The online system analyzes the scores to determine whether the text in the advertisement request includes malicious content.
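
The scoring idea described above can be illustrated with a minimal Python sketch. It is not the claimed implementation; several details are assumptions made only for illustration. A character's "type" is approximated by the first word of its Unicode name (for example LATIN or CYRILLIC), the conditional probabilities are estimated from a small list of words assumed to be benign, add-one smoothing is applied, and a text is flagged when any word's score falls below an arbitrary threshold. The names `char_type`, `word_profile`, `fit`, `word_score`, and `is_suspicious` are hypothetical.

```python
import unicodedata
from collections import Counter


def char_type(ch):
    """Coarse character type (assumption): script-like prefix of the Unicode name."""
    if ch.isalpha():
        name = unicodedata.name(ch, "")
        return name.split(" ")[0] if name else "UNKNOWN"
    if ch.isdigit():
        return "DIGIT"
    return "OTHER"


def word_profile(word):
    """Return (most common type, its count, set of other types present) for a non-empty word."""
    counts = Counter(char_type(c) for c in word)
    dominant, n_dominant = counts.most_common(1)[0]
    return dominant, n_dominant, set(counts) - {dominant}


def fit(benign_words):
    """Count how often each minority type co-occurs with (dominant type, dominant count)."""
    totals = Counter()  # (dominant, n) -> number of words seen
    hits = Counter()    # (dominant, n, minority type) -> number of words containing it
    for w in benign_words:
        dominant, n, minority = word_profile(w)
        totals[(dominant, n)] += 1
        for t in minority:
            hits[(dominant, n, t)] += 1
    return totals, hits


def word_score(word, model, smoothing=1.0):
    """Product of smoothed P(word includes type t | n characters of the dominant type).

    Words made of a single character type score 1.0; lower scores are more unusual.
    """
    totals, hits = model
    dominant, n, minority = word_profile(word)
    score = 1.0
    for t in minority:
        score *= (hits[(dominant, n, t)] + smoothing) / (totals[(dominant, n)] + 2 * smoothing)
    return score


def is_suspicious(text, model, threshold=0.5):
    """Flag text if any whitespace-separated word scores below the (arbitrary) threshold."""
    return any(word_score(w, model) < threshold for w in text.split())
```

For instance, after fitting on a few ordinary words, a word such as "fr\u0435e" (where the third letter is the Cyrillic "е" rather than the Latin "e") would receive a lower score than "free", because a Cyrillic character alongside three Latin characters was never observed in the benign sample. A real system would need a far larger corpus and a more careful definition of character types.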