Imagen is a family of artificial intelligence (AI) image-generation and editing models developed by Google that generate images from natural language prompts. The models build on advances in transformer-based large language models and use a cascade of diffusion models that first generates a small image and then progressively increases its resolution. Google is incorporating Imagen models into a range of its products, including image generation in Google Slides, Vertex AI on Google Cloud, and Android’s generative AI wallpaper. There are two main Imagen releases: Imagen 1 and Imagen 2.
With Imagen, users can do the following:
- Generate novel images by providing a text prompt
- Edit an uploaded or generated image with a text prompt
- Edit parts of an uploaded or generated image
- Upscale existing, generated, or edited images
- Fine-tune a model for a specific type of image generation
- Receive text descriptions of images with visual captioning
- Receive answers to a question about an image with Visual Question Answering (VQA)
Imagen is a text-to-image diffusion model built on a large transformer language model that interprets natural language prompts. It uses a large frozen T5-XXL encoder to convert the input text into embeddings. A conditional diffusion model then maps the text embeddings to a 64×64 image, and two text-conditional super-resolution diffusion models upsample that image from 64×64 to 256×256 and from 256×256 to 1024×1024. Imagen was trained on a combination of internal Google datasets and the publicly available LAION-400M dataset of image-text pairs.
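The cascade described above can be sketched at the level of image resolutions only. The following is a minimal illustration, not Google's implementation: the `Stage` class, the stage names, and the `trace_cascade` helper are hypothetical, and no text encoder or diffusion model is actually run. It simply checks that each stage consumes the output size of the previous one and reports the per-stage upscale factor.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One diffusion model in the cascade (hypothetical, for illustration)."""
    name: str
    in_res: Optional[int]  # input side length in pixels; None = generated from text only
    out_res: int           # output side length in pixels


# Resolutions come from the description above; the stage names are assumptions.
IMAGEN_CASCADE = [
    Stage("base text-to-image", None, 64),
    Stage("super-resolution 1", 64, 256),
    Stage("super-resolution 2", 256, 1024),
]


def trace_cascade(stages: List[Stage]) -> List[Tuple[str, int, Optional[int]]]:
    """Verify that the stages chain together and return, for each stage,
    (name, output resolution, upscale factor or None for the base stage)."""
    trace = []
    current: Optional[int] = None  # no image exists before the base stage
    for stage in stages:
        if stage.in_res != current:
            raise ValueError(
                f"{stage.name} expects input {stage.in_res}, got {current}"
            )
        factor = None if stage.in_res is None else stage.out_res // stage.in_res
        trace.append((stage.name, stage.out_res, factor))
        current = stage.out_res
    return trace


if __name__ == "__main__":
    for name, res, factor in trace_cascade(IMAGEN_CASCADE):
        scale = "from text" if factor is None else f"x{factor} per side"
        print(f"{name}: {res}x{res} ({scale})")
```

Each super-resolution stage multiplies the side length by 4, so the final 1024×1024 image has 256 times as many pixels as the 64×64 output of the base model.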
Imagen 1 was developed by Google Research's Brain Team and announced on May 23, 2022. The announcement was accompanied by a research paper describing the model. At the time, however, Google stated that Imagen was "not suitable for public use at this time." In November 2022, Google made Imagen available in a limited form through its AI Test Kitchen app. The release provided two ways to interact with Imagen, called "City Dreamer" and "Wobble." City Dreamer lets users build a town by describing each building. Wobble generates unique creatures based on user descriptions.
Google announced Imagen 2 at its annual I/O developer conference on May 10, 2023. Google stated the new text-to-image model would be made available via Vertex AI, the Google Cloud enterprise platform that provides access to foundation models along with security and governance controls. The model became generally available on Vertex AI on December 13, 2023, for customers on the approved allowlist. Vertex AI offers access to Imagen 2 on Google-managed infrastructure, together with privacy and safety features.
Imagen 2 provides improved image quality and a range of new features for different use cases, including text rendering, logo generation, and VQA. Imagen 2 was developed using technology from Google DeepMind.