Patent attributes
Systems and methods are disclosed to implement a machine learning system that is trained to assign annotations to text fragments in an unstructured sequence of text. The system employs a neural model that includes an encoder recurrent neural network (RNN) and a decoder RNN. The input text sequence is encoded by the encoder RNN into successive encoder hidden states. The encoder hidden states are then decoded by the decoder RNN to produce a sequence of annotations for text fragments within the text sequence. In embodiments, the system employs a fixed-attention window during the decoding phase to focus on a subset of the encoder hidden states when generating the annotations. In embodiments, the system employs a beam search technique to track a set of candidate annotation sequences before the annotations are output. By using a decoder RNN, the neural model is better equipped to capture long-range annotation dependencies in the text sequence.
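The decoding phase described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the encoder hidden states are stood in for by plain floats, the attention window is assumed to be centered on the current decoding step with one decoding step per token, the B/I/O label set is assumed, and the scoring function `toy_step_score` is a hypothetical placeholder for a trained decoder RNN. The names `attention_window` and `beam_search` are likewise illustrative.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Assumed label set (begin / inside / outside a fragment); the abstract names none.
LABELS = ("B", "I", "O")

# Toy stand-ins for encoder hidden states: one float per token (an assumption;
# real hidden states would be vectors produced by the encoder RNN).
ENCODER_STATES = [0.9, 0.9, 0.1, 0.1]


def attention_window(encoder_states: Sequence[float], t: int, width: int) -> List[float]:
    """Return the encoder states within `width` positions of decoding step t."""
    lo = max(0, t - width)
    hi = min(len(encoder_states), t + width + 1)
    return list(encoder_states[lo:hi])


def toy_step_score(t: int, prefix: Sequence[str], label: str) -> float:
    """Hypothetical unnormalized log score for `label` at step t.

    It depends on the windowed encoder context and on the previous label,
    which is how a decoder can express dependencies between annotations.
    """
    window = attention_window(ENCODER_STATES, t, width=1)
    context = sum(window) / len(window)
    p_entity = min(max(context, 0.05), 0.95)
    prev = prefix[-1] if prefix else "O"
    if label == "O":
        p = 1.0 - p_entity
    elif label == "B":
        p = p_entity if prev == "O" else 0.1 * p_entity
    else:  # "I" is only plausible directly after "B" or "I"
        p = p_entity if prev in ("B", "I") else 0.01
    return math.log(p)


def beam_search(
    num_steps: int,
    labels: Sequence[str],
    step_score: Callable[[int, Sequence[str], str], float],
    beam_size: int,
) -> List[Tuple[List[str], float]]:
    """Track the `beam_size` best candidate annotation sequences step by step."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    for t in range(num_steps):
        candidates = [
            (seq + [label], score + step_score(t, seq, label))
            for seq, score in beams
            for label in labels
        ]
        # Keep only the highest-scoring candidates for the next step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


best_sequence, best_score = beam_search(
    len(ENCODER_STATES), LABELS, toy_step_score, beam_size=2
)[0]
print(best_sequence, round(best_score, 3))
```

In this sketch, a beam size of 1 reduces to greedy decoding, while a larger beam keeps alternative annotation sequences alive until later steps can disambiguate them. The window width and beam size used here are arbitrary illustrative values.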