Patent attributes
A tamper-resistant text stream watermarking system is provided. Content, such as any text-based document including programming code, is encoded with a watermarking mechanism. The mechanism modifies the text itself according to a preset repeating pattern without changing its substance. Examples include patterned use of whitespace, contractions, abbreviations, the order of local variables in programming code, and the like. The pattern may include a binary fingerprint that can be used to trace the watermarked document to an assigned source or version of the original document. In analyzing a suspect text stream, patterns are generated based on instances of the mechanism and their corresponding bit values. Repeating patterns are combined into a bit stream, with separators between the patterns. The bit stream can then be analyzed to determine the source of the watermarked text stream.
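The whitespace variant of the mechanism can be illustrated with a minimal Python sketch. It is not the claimed implementation. The bit convention (one space for 0, two spaces for 1), the function names embed, extract_bits, and recover, and the majority-vote recovery are all assumptions made for illustration. The sketch also omits the separators between patterns and assumes single-line text whose word boundaries are not shifted by the tampering.

```python
import re


def embed(text: str, fingerprint: str) -> str:
    """Repeat the fingerprint bits across the gaps between words.

    Assumed convention: bit '0' -> one space, bit '1' -> two spaces.
    Existing whitespace (including line breaks) is normalized away.
    """
    words = text.split()
    if not words:
        return text
    pieces = []
    for i, word in enumerate(words[:-1]):
        bit = fingerprint[i % len(fingerprint)]
        pieces.append(word + (" " if bit == "0" else "  "))
    pieces.append(words[-1])
    return "".join(pieces)


def extract_bits(text: str) -> str:
    """Read one bit per inter-word gap from a suspect text stream."""
    gaps = re.findall(r" +", text.strip())
    return "".join("0" if len(gap) == 1 else "1" for gap in gaps)


def recover(bits: str, length: int) -> str:
    """Majority-vote each fingerprint position across all repetitions."""
    result = []
    for pos in range(length):
        votes = bits[pos::length]
        ones = votes.count("1")
        result.append("1" if ones * 2 > len(votes) else "0")
    return "".join(result)


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog " * 4
    marked = embed(original, "1011")
    # Simulate light tampering: collapse the first double space.
    tampered = marked.replace("  ", " ", 1)
    print(recover(extract_bits(tampered), 4))
```

Because the fingerprint repeats many times, a small number of altered gaps is outvoted by the intact repetitions. Edits that insert or delete words shift the alignment, which is the kind of case the separators between patterns would help with.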