Methods and apparatus for profile-guided preloading for virtualized resources are described. A block-level storage volume whose contents are to be populated via data transfers from a repository service is programmatically attached to a compute instance. An indication of data transfers from the repository to a block storage service implementing the volume is obtained, corresponding to a particular phase of program execution at the compute instance. A storage profile is generated, based at least in part on the indication of data transfers. The storage profile is subsequently used to pre-load data from the repository service on behalf of other compute instances.