Patent attributes
A method may include obtaining one or more software-repository packages. A programming-language function may be extracted from the one or more software-repository packages. A curation resource associated with the programming-language function may be identified. The curation resource may include descriptive information related to the programming-language function. The method may include generating a code description corresponding to the programming-language function based on the curation resource. A function-comment pair that includes the programming-language function and the generated code description may be determined. A programming-language corpus that includes the one or more software-repository packages may be generated and augmented with the function-comment pair. The method may include training a machine learning model using the programming-language corpus.
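
The steps of the method might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the packages are assumed to be Python source held in memory, the curation resources are assumed to be a plain mapping from function name to descriptive text, and the helper names (extract_functions, generate_description, build_corpus) are hypothetical. The description step is a stand-in that keeps only the first sentence of the curation text, and the model-training step is omitted.

```python
import ast
from typing import Dict, List, Tuple


def extract_functions(source: str) -> Dict[str, str]:
    """Return {function name: source text} for each function defined in `source`."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def generate_description(curation_text: str) -> str:
    """Stand-in for description generation: keep the first sentence only."""
    first_sentence = curation_text.strip().split(". ")[0]
    return first_sentence.rstrip(".") + "."


def build_corpus(
    packages: Dict[str, str], curation_resources: Dict[str, str]
) -> Dict[str, list]:
    """Build a corpus from package sources, augmented with function-comment pairs."""
    pairs: List[Tuple[str, str]] = []
    for source in packages.values():
        for name, code in extract_functions(source).items():
            resource = curation_resources.get(name)
            if resource is None:
                continue  # no curation resource identified for this function
            pairs.append((code, generate_description(resource)))
    return {
        "packages": list(packages.values()),
        "function_comment_pairs": pairs,
    }


# Hypothetical inputs: one tiny package and one curation resource.
packages = {
    "mathutils": "def add(a, b):\n    return a + b\n\ndef sub(a, b):\n    return a - b\n",
}
curation_resources = {
    "add": "Adds two numbers and returns the sum. Often used in tutorials.",
}

corpus = build_corpus(packages, curation_resources)
for code, comment in corpus["function_comment_pairs"]:
    print(comment)
    print(code)
```

In this sketch only add receives a function-comment pair, because no curation resource was supplied for sub. The resulting corpus holds both the original package sources and the added pairs, and would then be passed to whatever training procedure is used.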