Is a
Patent attributes
Current Assignee
Patent Jurisdiction
Patent Number
Date of Patent
March 29, 2011
Patent Application Number
12822439
Date Filed
June 24, 2010
Patent Citations Received
Patent Primary Examiner
Patent abstract
A training procedure for N-gram-based statistical document classification is disclosed. In one embodiment, a set of N-grams is selected from a second set of N-grams, each N-gram being a sequence of N bytes, where N is an integer. A statistical content classification model is then generated based on occurrences, if any, of the selected N-grams in a set of training documents and a set of validation documents. The statistical content classification model is provided to content filters to classify content.
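The abstract does not say how the N-grams are selected, what form the model takes, or how the validation documents are used. The following is therefore only a minimal sketch of one way such a pipeline could look, not the patented method. It assumes two classes, selection by the gap in per-class document frequency, smoothed log-odds weights, and a decision threshold tuned on the validation documents. All function names (`byte_ngrams`, `select_ngrams`, `train`, `score`, `pick_threshold`) are hypothetical.

```python
import math
from collections import Counter

def byte_ngrams(data: bytes, n: int) -> set:
    """Return the distinct n-byte sequences that occur in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def _doc_freq(grams, docs, labels, n):
    """Count, per class, how many documents contain each N-gram."""
    pos, neg = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos if label else neg).update(byte_ngrams(doc, n) & grams)
    n_pos = sum(labels)
    return pos, neg, n_pos, len(labels) - n_pos

def select_ngrams(candidates: set, docs, labels, n: int, k: int) -> list:
    """Pick k N-grams out of the candidate set.
    Assumed criterion: largest gap in document frequency between classes."""
    pos, neg, n_pos, n_neg = _doc_freq(candidates, docs, labels, n)
    def gap(g):
        return abs(pos[g] / max(n_pos, 1) - neg[g] / max(n_neg, 1))
    # Ties are broken by byte order so the selection is deterministic.
    return sorted(candidates, key=lambda g: (-gap(g), g))[:k]

def train(selected, docs, labels, n: int) -> dict:
    """Assumed model: one add-one-smoothed log-odds weight per selected N-gram."""
    sel = set(selected)
    pos, neg, n_pos, n_neg = _doc_freq(sel, docs, labels, n)
    return {g: math.log((pos[g] + 1) / (n_pos + 2))
               - math.log((neg[g] + 1) / (n_neg + 2)) for g in sel}

def score(weights: dict, doc: bytes, n: int) -> float:
    """Sum the weights of the selected N-grams that occur in doc."""
    return sum(weights.get(g, 0.0) for g in byte_ngrams(doc, n))

def pick_threshold(weights, val_docs, val_labels, n: int) -> float:
    """Assumed use of validation documents: choose the score cut-off
    (midpoint between adjacent validation scores) with the best accuracy."""
    scores = [score(weights, d, n) for d in val_docs]
    s = sorted(set(scores))
    cands = [(a + b) / 2 for a, b in zip(s, s[1:])] or s
    best_t, best_acc = 0.0, -1.0
    for t in cands:
        hits = sum((sc >= t) == bool(lab) for sc, lab in zip(scores, val_labels))
        acc = hits / len(val_labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t
```

In this sketch a content filter would receive `weights` and the threshold, then flag a document when `score(weights, doc, n)` is at or above the threshold. Working on raw bytes keeps the procedure independent of language and character encoding, which fits the abstract's definition of an N-gram as a sequence of N bytes.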