Patent attributes
One or more content items can be received at a data recognition module. The data recognition module can use, individually or in any combination, image recognition (e.g., OCR, object recognition), audio recognition (e.g., speech recognition, music identification), and/or text recognition (e.g., text crawling) to identify or recognize at least a portion of the one or more content items. Based on the identified content portion(s), the one or more content items and/or their respective source(s) can be classified. In one example, a source webpage can include an image containing a curse word that is not yet machine-readable. The image can be received at the data recognition module, and the curse word contained in the image can be recognized or identified using an OCR process. Based, at least in part, on the recognized or identified curse word, the image and/or the webpage can be classified as likely being associated with inappropriate material.
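
The flow described above (receive content items, recognize a portion of each, then classify the item and its source) can be illustrated with a minimal Python sketch. It is only one possible arrangement and is not taken from the patent itself: the names `ContentItem`, `stub_ocr`, `classify_items`, the `FLAGGED_TERMS` list, and the label strings are all hypothetical, and the OCR step is replaced by a stand-in function so the example runs without an OCR engine.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Set

# Hypothetical term list; a real system would use a curated lexicon.
FLAGGED_TERMS: Set[str] = {"darn", "heck"}


@dataclass
class ContentItem:
    kind: str        # "image", "audio", or "text"
    payload: object  # raw content; a plain str in this sketch
    source: str      # identifier of the source, e.g. a webpage URL


# A recognizer turns a raw payload into machine-readable text.
Recognizer = Callable[[object], str]


def stub_ocr(payload: object) -> str:
    """Stand-in for a real OCR engine (assumption: the payload already
    holds the text that OCR would extract from the image)."""
    return str(payload)


def recognize(item: ContentItem, recognizers: Dict[str, Recognizer]) -> str:
    """Dispatch to the recognizer registered for the item's kind."""
    recognizer = recognizers.get(item.kind)
    return recognizer(item.payload) if recognizer else ""


def classify_text(text: str) -> str:
    """Label text by checking its words against the flagged-term list."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return "likely_inappropriate" if words & FLAGGED_TERMS else "unclassified"


def classify_items(
    items: Iterable[ContentItem], recognizers: Dict[str, Recognizer]
) -> Dict[str, str]:
    """Return one label per source; a source is flagged if any item is."""
    labels: Dict[str, str] = {}
    for item in items:
        label = classify_text(recognize(item, recognizers))
        if label == "likely_inappropriate" or item.source not in labels:
            labels[item.source] = label
    return labels


if __name__ == "__main__":
    page = "http://example.test/page"
    items = [
        ContentItem("image", "What the HECK", page),
        ContentItem("text", "hello world", page),
    ]
    print(classify_items(items, {"image": stub_ocr, "text": str}))
```

In this sketch the classification is a simple word-list match and a source inherits the strongest label of any of its items; a real system could substitute an actual OCR, speech, or text-crawling component for each recognizer and a more nuanced classifier.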