A method for extracting information from electronic documents, including: learning terms and term variants from a training corpus, wherein the terms and the term variants correspond to a specialized dictionary related to the training corpus; generating a list of negative indicators found in the training corpus; performing a partial match of the terms and the term variants in a set of electronic documents to create initial match results; and performing, on the initial match results, a negation test using the negative indicators and a positive terms test using the terms and the term variants, to remove from the initial match results any matches that fail either the negation test or the positive terms test, resulting in final match results.
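
The matching and filtering steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not define how either test works, so everything concrete here is an assumption made for illustration. The sketch assumes that the terms, term variants, and negative indicators have already been learned (they are passed in as a plain dictionary and set), that a "partial match" is a case-insensitive substring hit expanded to the enclosing word, that the negation test fails when a negative indicator appears within a few words before the match in the same sentence, and that the positive terms test passes only when the enclosing word is itself a known term or variant. The names `Match`, `partial_match`, `passes_negation_test`, `passes_positive_terms_test`, and `extract` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    doc_id: int
    term: str   # canonical term the matched variant maps to
    start: int  # span of the enclosing word in the document
    end: int
    text: str   # the enclosing word as it appears in the document


def partial_match(docs, variants):
    """Find every variant as a case-insensitive substring (assumed meaning of
    'partial match') and expand each hit to the enclosing alphanumeric word."""
    found = set()  # a set removes duplicates from overlapping variants
    for doc_id, doc in enumerate(docs):
        for variant, term in variants.items():
            for hit in re.finditer(re.escape(variant), doc, flags=re.IGNORECASE):
                start, end = hit.start(), hit.end()
                while start > 0 and doc[start - 1].isalnum():
                    start -= 1
                while end < len(doc) and doc[end].isalnum():
                    end += 1
                found.add(Match(doc_id, term, start, end, doc[start:end]))
    return sorted(found, key=lambda m: (m.doc_id, m.start))


def passes_negation_test(doc, match, negative_indicators, window=3):
    """Assumed rule: fail if a negative indicator occurs among the `window`
    words that precede the match within the same sentence."""
    sentence_prefix = re.split(r"[.;!?]", doc[:match.start])[-1]
    preceding = re.findall(r"[a-z']+", sentence_prefix.lower())[-window:]
    return not any(word in negative_indicators for word in preceding)


def passes_positive_terms_test(match, variants):
    """Assumed rule: the enclosing word must itself be a known term or variant."""
    return match.text.lower() in variants


def extract(docs, variants, negative_indicators):
    """Partial match, then drop matches that fail either test."""
    initial = partial_match(docs, variants)
    return [
        m for m in initial
        if passes_negation_test(docs[m.doc_id], m, negative_indicators)
        and passes_positive_terms_test(m, variants)
    ]


# Hypothetical learned data: variant -> canonical term, plus negative indicators.
variants = {"fever": "fever", "cough": "cough", "coughs": "cough", "cold": "cold"}
negative_indicators = {"no", "denies", "without"}

docs = [
    "Patient denies fever but reports a persistent cough.",
    "No cough. Fever since Monday.",
    "She scolded the child; coughs at night.",
]

for m in extract(docs, variants, negative_indicators):
    print(m.doc_id, m.term, repr(m.text))
```

In this example, "fever" in the first document and "cough" in the second are removed by the negation test, while the hit for "cold" inside "scolded" is removed by the positive terms test. The three-word window and the sentence-boundary characters are arbitrary choices; a real system would tune or replace them.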