Generating a table of contents from a computer document is disclosed. The computer document is converted into a markup language, from which a list of grouped textblocks is generated. Headings are detected from among the list of grouped textblocks. For a grouped textblock, a first vector corresponding to a semantic representation of the grouped textblock and a second vector based on an evaluation of pre-defined features in the grouped textblock are generated. Based on the first and second vectors, the grouped textblock is classified as either a heading or plain text using a trained classifier.
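
The classification step can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the disclosure: the hashed bag-of-words in `semantic_vector` merely stands in for a real semantic representation, the four features in `feature_vector` are hypothetical examples of pre-defined features, the two vectors are combined by simple concatenation, and a small logistic regression plays the role of the trained classifier. The training textblocks are toy data.

```python
import math
import re
import zlib

SEMANTIC_DIM = 16  # hypothetical size of the stand-in semantic vector


def semantic_vector(text, dim=SEMANTIC_DIM):
    """First vector: a hashed bag of words, L2-normalised.
    Assumption: this only stands in for a learned semantic embedding."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def feature_vector(text):
    """Second vector: a few illustrative (hypothetical) pre-defined features."""
    words = text.split()
    capitalised = sum(1 for w in words if w[:1].isupper())
    return [
        1.0 if len(words) <= 8 else 0.0,                          # short block
        1.0 if text.rstrip().endswith((".", "!", "?")) else 0.0,  # ends like a sentence
        1.0 if re.match(r"^\s*\d+(\.\d+)*\.?\s", text) else 0.0,  # leading section number
        capitalised / len(words) if words else 0.0,               # share of capitalised words
    ]


def combined_vector(text):
    """Concatenate the first and second vectors (one possible way to combine them)."""
    return semantic_vector(text) + feature_vector(text)


def train_logistic(samples, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression classifier with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def classify(text, weights, bias):
    """Label a grouped textblock as 'heading' or 'plain-text'."""
    x = combined_vector(text)
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return "heading" if z > 0 else "plain-text"


# Toy training data: (grouped textblock, 1 = heading, 0 = plain text).
TRAINING = [
    ("1. Introduction", 1),
    ("2.1 Related Work", 1),
    ("Conclusion", 1),
    ("The document is first converted into a markup language for later processing.", 0),
    ("Each grouped textblock is then passed to the trained classifier.", 0),
    ("Headings that are detected this way are assembled into a table of contents.", 0),
]

weights, bias = train_logistic(
    [combined_vector(text) for text, _ in TRAINING],
    [label for _, label in TRAINING],
)

if __name__ == "__main__":
    for block in ["3. Methods", "The results are discussed in the following paragraphs."]:
        print(block, "->", classify(block, weights, bias))
```

In this sketch, the textblocks classified as headings would then be collected in document order to form the table of contents. A practical system would presumably replace the hashed vector with a proper language-model embedding and draw its features from the markup itself (for example font size or weight), neither of which is modelled here.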