Patent attributes
An embodiment features a method of generating test data. An application-level schema corresponding to a source relational database is received. The schema defines constraints comprising one or more of inter-field, inter-record, and inter-object constraints between related data in the source relational database. A random walk is performed on a graph of nodes representing data in the source relational database. At respective ones of the nodes, corresponding ones of the data in the source relational database are selected along a path ordered in accordance with the constraints defined in the schema. Synthetic test data is generated based on one or more statistical models of the data selected from the source relational database. Data values are generated for respective fields of an object defined in the schema, and data values are generated for records related to the object based on one or more of the constraints defined in the schema.
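The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The schema layout, the `SOURCE` toy data, the `parent_id` relation, the function names, and the use of empirical value frequencies as the "statistical model" are all assumptions made for illustration. The sketch walks parent records before their related child records, fits a frequency model per field from the sampled records, and then generates synthetic objects together with their related records.

```python
import random
from collections import Counter

# Hypothetical application-level schema: each object type lists its fields
# and the child object types related to it (an inter-object constraint).
SCHEMA = {
    "Customer": {"fields": ["region"], "children": ["Order"]},
    "Order": {"fields": ["status"], "children": []},
}

# Toy stand-in for the source relational database (assumed data).
SOURCE = {
    "Customer": [{"id": 1, "region": "EU"}, {"id": 2, "region": "US"}],
    "Order": [
        {"id": 10, "parent_id": 1, "status": "paid"},
        {"id": 11, "parent_id": 1, "status": "open"},
        {"id": 12, "parent_id": 2, "status": "paid"},
    ],
}


def random_walk(root_type, steps, rng):
    """Sample (type, record) pairs, always visiting a parent before its children."""
    samples = []
    for _ in range(steps):
        obj_type = root_type
        record = rng.choice(SOURCE[obj_type])
        samples.append((obj_type, record))
        while SCHEMA[obj_type]["children"]:
            child_type = rng.choice(SCHEMA[obj_type]["children"])
            related = [r for r in SOURCE[child_type]
                       if r["parent_id"] == record["id"]]
            if not related:
                break  # dead end: this parent has no related records
            obj_type, record = child_type, rng.choice(related)
            samples.append((obj_type, record))
    return samples


def fit_models(samples):
    """Build an empirical value-frequency model per (object type, field)."""
    models = {}
    for obj_type, record in samples:
        for field in SCHEMA[obj_type]["fields"]:
            models.setdefault((obj_type, field), Counter())[record[field]] += 1
    return models


def draw(counter, rng):
    """Draw one value with probability proportional to its observed count."""
    values = list(counter)
    weights = [counter[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def generate(root_type, models, count, children_per_parent, rng):
    """Generate synthetic root objects, then records related to each object."""
    out = {obj_type: [] for obj_type in SCHEMA}
    next_id = 1

    def build(obj_type, parent_id):
        nonlocal next_id
        rec = {"id": next_id, "parent_id": parent_id}
        next_id += 1
        for field in SCHEMA[obj_type]["fields"]:
            rec[field] = draw(models[(obj_type, field)], rng)
        out[obj_type].append(rec)
        # Related records reference the synthetic parent, so the
        # parent/child constraint holds in the generated data.
        for child_type in SCHEMA[obj_type]["children"]:
            for _ in range(children_per_parent):
                build(child_type, rec["id"])

    for _ in range(count):
        build(root_type, None)
    return out


rng = random.Random(0)
samples = random_walk("Customer", 50, rng)
models = fit_models(samples)
data = generate("Customer", models, count=3, children_per_parent=2, rng=rng)
```

Here `data["Customer"]` holds three synthetic customers and `data["Order"]` holds six synthetic orders, each pointing at one of the generated customers. A fixed number of children per parent is a simplification; a fuller model could also learn the distribution of related-record counts from the walk.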