Managing lineage information includes: receiving lineage information representing one or more lineage relationships among two or more data processing programs and two or more logical datasets; receiving one or more runtime artifacts, each runtime artifact including information related to a previous execution of a data processing program of the two or more data processing programs; and analyzing the one or more runtime artifacts and the lineage information to determine one or more candidate modifications to the lineage information.
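The analysis step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the lineage information is held as a set of (program, logical dataset, direction) edges and that each runtime artifact has already been resolved to the logical datasets a program read and wrote during a previous execution. The names `RuntimeArtifact`, `CandidateModification`, and `candidate_modifications` are hypothetical and do not come from the source, which does not specify an artifact format or the kinds of modification produced.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple

# Assumed representation: a lineage edge is (program, logical_dataset, direction),
# where direction is "reads" or "writes".
Edge = Tuple[str, str, str]


@dataclass(frozen=True)
class RuntimeArtifact:
    """Hypothetical record of one previous execution of a data processing program."""
    program: str
    datasets_read: FrozenSet[str]
    datasets_written: FrozenSet[str]


@dataclass(frozen=True)
class CandidateModification:
    """A proposed change to the lineage information; not applied automatically."""
    action: str  # "add" (edge observed but not recorded) or "review" (recorded but not observed)
    edge: Edge


def observed_edges(artifacts: Iterable[RuntimeArtifact]) -> Set[Edge]:
    """Collect every program/dataset relationship seen in the runtime artifacts."""
    edges: Set[Edge] = set()
    for artifact in artifacts:
        edges.update((artifact.program, d, "reads") for d in artifact.datasets_read)
        edges.update((artifact.program, d, "writes") for d in artifact.datasets_written)
    return edges


def candidate_modifications(
    lineage: Set[Edge], artifacts: Iterable[RuntimeArtifact]
) -> List[CandidateModification]:
    """Compare recorded lineage with observed behavior and propose changes."""
    artifacts = list(artifacts)
    observed = observed_edges(artifacts)
    executed = {a.program for a in artifacts}

    # Relationships seen at run time but absent from the lineage information.
    candidates = [CandidateModification("add", e) for e in sorted(observed - lineage)]
    # Recorded relationships never observed, limited to programs that have artifacts,
    # since a program with no artifact provides no evidence either way.
    candidates += [
        CandidateModification("review", e)
        for e in sorted(lineage - observed)
        if e[0] in executed
    ]
    return candidates


# Usage with made-up program and dataset names.
lineage = {
    ("load_orders", "raw_orders", "reads"),
    ("load_orders", "orders", "writes"),
    ("report", "orders", "reads"),
}
artifacts = [
    RuntimeArtifact("load_orders", frozenset({"raw_orders"}), frozenset({"orders", "rejects"})),
]
result = candidate_modifications(lineage, artifacts)
```

In this example the only candidate is an "add" for the edge from `load_orders` to `rejects`, because that write appears in the artifact but not in the recorded lineage. The `report` edge is left alone because no artifact exists for that program. Returning candidates rather than editing the lineage in place mirrors the wording of the abstract, which speaks of determining candidate modifications.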