Is a
Patent attributes
Current Assignee
Patent Jurisdiction
Patent Number
Patent Inventor Names
Thomas E. Raffill
John Gmuender
Roman Yanovsky
Shunhui Zhu
Boris Yanovsky
Date of Patent
September 7, 2010
Patent Application Number
11881770
Date Filed
July 27, 2007
Patent Primary Examiner
Patent abstract
A training procedure for N-gram based statistical document classification has been disclosed. In one embodiment, a set of N-grams is selected out of a second set of N-grams, each of the N-grams having a sequence of N bytes, where N is an integer. Then a statistical content classification model is generated based on occurrences of the N-grams, if any, in a set of training documents and a set of validation documents. The statistical content classification model is provided to content filters to classify content.
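The abstract gives no algorithmic detail, so the following is only a minimal sketch of how such a training procedure might look, not the patented method. The selection criterion (difference in document frequency between classes), the smoothed log-odds weights, the threshold search on the validation set, and all function names and sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> set:
    """Return the distinct N-byte sequences that occur in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def class_counts(ngrams, docs, n):
    """Count, per class, how many documents contain each N-gram."""
    pos, neg = Counter(), Counter()
    for data, label in docs:
        present = byte_ngrams(data, n) & ngrams
        (pos if label else neg).update(present)
    return pos, neg


def select_ngrams(candidates, docs, n, k):
    """Assumed criterion: keep the k candidates whose document
    frequency differs most between the two classes."""
    pos, neg = class_counts(candidates, docs, n)
    ranked = sorted(candidates, key=lambda g: (-abs(pos[g] - neg[g]), g))
    return set(ranked[:k])


def train(selected, train_docs, n):
    """Assumed model: one Laplace-smoothed log-odds weight per N-gram."""
    n_pos = sum(1 for _, label in train_docs if label)
    n_neg = len(train_docs) - n_pos
    pos, neg = class_counts(selected, train_docs, n)
    return {
        g: math.log((pos[g] + 1) / (n_pos + 2))
        - math.log((neg[g] + 1) / (n_neg + 2))
        for g in selected
    }


def score(weights, data, n):
    """Sum the weights of the selected N-grams present in data."""
    return sum(weights[g] for g in byte_ngrams(data, n) if g in weights)


def pick_threshold(weights, validation_docs, n):
    """Choose the midpoint between adjacent validation scores that
    gives the most correct validation decisions."""
    scored = [(score(weights, d, n), label) for d, label in validation_docs]
    values = sorted({s for s, _ in scored})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or [0.0]
    return max(candidates,
               key=lambda t: sum((s >= t) == label for s, label in scored))


# Hypothetical toy data: True marks the class to be filtered.
N = 3
training = [
    (b"buy cheap pills now", True),
    (b"cheap pills online", True),
    (b"meeting at noon", False),
    (b"lunch meeting today", False),
]
validation = [(b"cheap pills", True), (b"team meeting", False)]

# The larger candidate set here is every N-gram seen in training.
candidate_set = set().union(*(byte_ngrams(d, N) for d, _ in training))
selected = select_ngrams(candidate_set, training, N, k=10)
weights = train(selected, training, N)
threshold = pick_threshold(weights, validation, N)
```

A content filter receiving this model would then classify a document by checking whether `score(weights, document, N) >= threshold`.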