Stable Video Diffusion (SVD) is a foundation model for generative video from Stability AI, built on the company's image model, Stable Diffusion. It is a latent diffusion model trained to generate short video clips from a single image used as the conditioning frame. SVD supports text-to-video and image-to-video generation, as well as multi-view synthesis through fine-tuning of the image-to-video model. Users supply a conditioning image or, for text-to-video, a natural language description, and receive a short video clip.
The model was released in research preview on November 21, 2023, with the code available on the company's GitHub repository and the weights needed to run it locally on its Hugging Face page. Stability AI also released a research paper detailing the model's technical capabilities. At initial release, SVD was intended for research purposes only, not for real-world or commercial applications, because the model was not trained to be factual or to provide true representations of people or events.
SVD was first trained on a dataset of millions of videos and then fine-tuned on a much smaller dataset of hundreds of thousands to around a million clips. It can be adapted to multiple tasks, including multi-view synthesis from a single image, by fine-tuning on multi-view datasets. At release, SVD offered two image-to-video models, generating 14 frames and 25 frames respectively, at customizable frame rates between 3 and 30 frames per second.
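To give a sense of what these frame counts mean in practice, the short Python sketch below computes how long a clip plays at a given frame rate. It is only an illustration of the arithmetic (duration = frames / frame rate); the function and constant names are hypothetical and are not part of any Stability AI code.

```python
# Illustrative arithmetic only: names here are hypothetical, not an SVD API.
FRAME_COUNTS = (14, 25)   # frame counts of the two released models (from the text)
MIN_FPS, MAX_FPS = 3, 30  # customizable frame-rate range (from the text)


def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Return the playback length in seconds of num_frames shown at fps."""
    if num_frames <= 0:
        raise ValueError(f"num_frames must be positive, got {num_frames}")
    if not MIN_FPS <= fps <= MAX_FPS:
        raise ValueError(f"fps must be between {MIN_FPS} and {MAX_FPS}, got {fps}")
    return num_frames / fps


if __name__ == "__main__":
    for n in FRAME_COUNTS:
        shortest = clip_duration_seconds(n, MAX_FPS)
        longest = clip_duration_seconds(n, MIN_FPS)
        print(f"{n} frames: {shortest:.2f} s at {MAX_FPS} fps, "
              f"{longest:.2f} s at {MIN_FPS} fps")
```

Under these assumptions, the clips range from under half a second (14 frames at 30 frames per second) to a little over eight seconds (25 frames at 3 frames per second).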