A method and system for verifying the results of a machine learning (ML) model hosted by a machine learning as a service (MLaaS) for an MLaaS consumer (MLC), including: receiving a request from the MLC for benchmark samples for the ML model; generating one-time use benchmark samples and corresponding benchmark outputs for the ML model; transmitting the benchmark samples to the MLC; receiving, from the MLC, outputs of the ML model on the benchmark samples; comparing the MLC outputs on the benchmark samples with the generated benchmark outputs to verify the ML model outputs; and sending a verification message to the MLC based on the verification.
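
The steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class name `BenchmarkVerifier`, the toy `reference_model`, the use of a random token to identify each benchmark set, and the exact-equality comparison are all assumptions made for illustration. A practical system would likely compare numeric outputs within a tolerance and would transmit samples over a network rather than through direct function calls.

```python
import secrets
from typing import Callable, Dict, List, Tuple

Model = Callable[[float], int]


def reference_model(x: float) -> int:
    """Hypothetical stand-in for a trusted copy of the ML model."""
    return 1 if x >= 0.5 else 0


class BenchmarkVerifier:
    """Hypothetical verifier that issues one-time use benchmark sets."""

    def __init__(self, model: Model, n_samples: int = 8) -> None:
        self._model = model
        self._n = n_samples
        self._rng = secrets.SystemRandom()
        # token -> expected benchmark outputs (assumed bookkeeping scheme)
        self._pending: Dict[str, List[int]] = {}

    def request_benchmark(self) -> Tuple[str, List[float]]:
        """Generate fresh benchmark samples and record their expected outputs."""
        samples = [self._rng.random() for _ in range(self._n)]
        expected = [self._model(x) for x in samples]
        token = secrets.token_hex(16)
        self._pending[token] = expected
        # Only the samples (and the token) are sent to the MLC, never the outputs.
        return token, samples

    def verify(self, token: str, mlc_outputs: List[int]) -> str:
        """Compare the MLC-reported outputs with the stored benchmark outputs."""
        # pop() removes the entry, so each benchmark set can be checked only once.
        expected = self._pending.pop(token, None)
        if expected is None:
            return "rejected: unknown or already used benchmark"
        return "verified" if list(mlc_outputs) == expected else "mismatch"


if __name__ == "__main__":
    verifier = BenchmarkVerifier(reference_model)
    token, samples = verifier.request_benchmark()
    # The MLC would run the samples through the hosted model; here an honest
    # host is simulated by calling the same reference model.
    hosted_outputs = [reference_model(x) for x in samples]
    print(verifier.verify(token, hosted_outputs))
```

Removing the stored outputs on the first check is one simple way to model the one-time use property: a replayed token is rejected rather than re-verified, so a host cannot learn the benchmark set and answer it correctly later.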