Patent attributes
An analytical computing environment for large data sets comprises a software platform for data management. The platform provides various automation and self-service features to enable those users to rapidly provision and manage an agile analytics environment. The platform leverages a metadata repository, which tracks and manages all aspects of the data lifecycle. The repository maintains various types of platform metadata including, for example, status information (load dates, quality exceptions, access rights, etc.), definitions (business meaning, technical formats, etc.), lineage (data sources and processes creating a data set, etc.), and user data (user rights, access history, user comments, etc.). Within the platform, the metadata is integrated with all platform services, such as load processing, quality controls and system use. As the system is used, the metadata gets richer and more valuable, supporting additional automation and quality controls.