Methods, systems, and computer program products are described herein that provide a serverless, multi-engine, multi-user data lake indexing subsystem and application programming interface. Indexes are defined as derived datasets and stored on the data lake in a universal format that enables disparate engines to create and/or discover indexes for workload optimization. Embodiments of the indexes enable stateful control and management of an index via metadata that is included in the index and stored on the data lake.
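
By way of non-limiting illustration only, the following Python sketch shows one way in which index state might be carried in metadata stored alongside the index, so that any engine able to read the storage location can create, manage, or discover an index without a dedicated server. The directory layout, the file name `_index_log.json`, the state names, the permitted transitions, and all function names are assumptions introduced for this example and are not taken from the described embodiments. A local directory stands in for the data lake, and concurrency control between multiple writers is omitted.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical lifecycle states and permitted transitions (assumed for illustration).
ALLOWED = {
    None: {"CREATING"},
    "CREATING": {"ACTIVE"},
    "ACTIVE": {"REFRESHING", "DELETED"},
    "REFRESHING": {"ACTIVE"},
    "DELETED": set(),
}


def _log_path(lake_root, name):
    # Assumed layout: <lake_root>/indexes/<index name>/_index_log.json
    return Path(lake_root) / "indexes" / name / "_index_log.json"


def read_log(lake_root, name):
    """Return the list of metadata entries for an index (empty if none exist)."""
    path = _log_path(lake_root, name)
    return json.loads(path.read_text()) if path.exists() else []


def current_state(lake_root, name):
    """Return the state recorded in the latest metadata entry, or None."""
    log = read_log(lake_root, name)
    return log[-1]["state"] if log else None


def transition(lake_root, name, new_state, **details):
    """Append a metadata entry if the state transition is permitted."""
    log = read_log(lake_root, name)
    state = log[-1]["state"] if log else None
    if new_state not in ALLOWED[state]:
        raise ValueError(f"cannot move index {name!r} from {state} to {new_state}")
    log.append({"state": new_state, **details})
    path = _log_path(lake_root, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(log, indent=2))
    return new_state


def discover_indexes(lake_root):
    """List the indexes whose latest recorded state is ACTIVE."""
    root = Path(lake_root) / "indexes"
    if not root.exists():
        return []
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_dir() and current_state(lake_root, p.name) == "ACTIVE"
    )


if __name__ == "__main__":
    lake = tempfile.mkdtemp()  # local stand-in for a data lake location
    transition(lake, "orders_by_id", "CREATING", source="orders", columns=["id"])
    transition(lake, "orders_by_id", "ACTIVE")
    print(discover_indexes(lake))
```

In this sketch, a second engine that is given only the storage location could call `discover_indexes` to find usable indexes, because the state needed to manage each index resides in the metadata rather than in any one engine or service.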