US Patent 9645988 System and method for identifying passages in electronic documents

The method proposed here deconstructs training sentences into a stream of features that represent both the sentences and the tokens used in the text, their sequence, and other ancillary features extracted using natural language processing. Then, we use a conditional random field in which the concept we are looking for is represented as state A and the background (everything that is not concept A) as state B. The model created by this training phase is then used to locate the concept as a sequence of sentences within a document. This has distinct advantages in accuracy and speed over methods that classify each sentence individually and then use a secondary method to group the classified sentences into passages. Furthermore, while previous methods were based only on searching for the occurrence of tokens, the use of a wider set of features enables this method to locate relevant passages even when different terminology is in use.
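
The two-state labeling described above can be illustrated with a minimal sketch of the decoding step of a linear-chain model. It is not the patent's implementation: training is omitted, the weights are hand-set stand-ins for what a trained conditional random field might learn, and the feature names (`has_term`, `has_money`), the bias values, and the switch penalty are all hypothetical. The sketch only shows how labeling every sentence jointly yields contiguous passages without a separate grouping classifier.

```python
from typing import Dict, List, Tuple

STATES = ("A", "B")  # A = target concept, B = background

# Hypothetical hand-set weights; a trained CRF would learn these from data.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "A": {"has_term": 4.0, "has_money": 3.0},
    "B": {},
}
BIAS = {"A": -1.0, "B": 0.0}   # featureless sentences lean toward background
SWITCH_PENALTY = 0.75          # cost of changing state between sentences


def emission(state: str, feats: List[str]) -> float:
    """Score of assigning `state` to a sentence with the given features."""
    return BIAS[state] + sum(WEIGHTS[state].get(f, 0.0) for f in feats)


def transition(prev: str, cur: str) -> float:
    """Score of moving from `prev` to `cur` between adjacent sentences."""
    return 0.0 if prev == cur else -SWITCH_PENALTY


def decode(sentence_feats: List[List[str]]) -> List[str]:
    """Viterbi decoding: the highest-scoring A/B label sequence."""
    if not sentence_feats:
        return []
    best = {s: emission(s, sentence_feats[0]) for s in STATES}
    back: List[Dict[str, str]] = []
    for feats in sentence_feats[1:]:
        row, ptr = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] + transition(p, s))
            row[s] = best[prev] + transition(prev, s) + emission(s, feats)
            ptr[s] = prev
        best = row
        back.append(ptr)
    path = [max(STATES, key=lambda s: best[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def passages(labels: List[str]) -> List[Tuple[int, int]]:
    """Collapse runs of state A into inclusive (start, end) sentence spans."""
    spans, start = [], None
    for i, label in enumerate(labels):
        if label == "A" and start is None:
            start = i
        elif label != "A" and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(labels) - 1))
    return spans


# One feature list per sentence; sentence 2 carries no features of its own.
example = [[], ["has_term"], [], ["has_money"], [], [], []]
labels = decode(example)
print(labels, passages(labels))
```

In this toy input the third sentence has no features, yet it is labeled A because its neighbors are; the passage falls out of the label sequence directly, which is the contrast the text draws with classifying each sentence in isolation and grouping afterwards.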

