Imagen

imagen.research.google

Is a: Product, Technology

Product attributes

Industry: Artificial Intelligence (AI), Generative AI, AI image generation

Overview

Imagen is a family of artificial intelligence (AI) image-generation and editing models developed by Google that generate images from natural language prompts. Imagen models build on advances in transformer-based large language models and use a cascade of diffusion models that first generates a small image and then progressively increases its resolution. Google is incorporating Imagen models into a range of its products, including image generation in Google Slides, Vertex AI on Google Cloud, and Android’s generative AI wallpaper. There are two main Imagen releases: Imagen 1 and Imagen 2.

With Imagen, users can do the following:

Generate novel images by providing a text prompt
Edit an uploaded or generated image with a text prompt
Edit parts of an uploaded or generated image
Upscale existing, generated, or edited images
Fine-tune a model for a specific type of image generation
Receive text descriptions of images with visual captioning
Receive answers to a question about an image with Visual Question Answering (VQA)

Imagen 1

Imagen 1 is a text-to-image diffusion model built on a large transformer language model that interprets natural language prompts. It uses a large frozen T5-XXL encoder to convert the input text into embeddings. A conditional diffusion model then maps the text embeddings to a 64×64 image, and two text-conditional super-resolution diffusion models upsample that image from 64×64 to 256×256 and from 256×256 to 1024×1024. According to the research paper, Imagen was trained on a combination of internal datasets and the publicly available LAION-400M dataset.
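The cascade described above can be summarized as a chain of stages whose resolutions must line up. The following is a minimal bookkeeping sketch in Python, not Google's implementation: the resolutions come from the text, while the Stage class, the stage names, and the helper functions are hypothetical labels introduced only for illustration. No diffusion model is run here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    """One stage of the cascade (hypothetical helper, not part of any Imagen API)."""
    name: str
    in_res: Optional[int]  # None for the base model, which starts from noise
    out_res: int


# Resolutions are taken from the text; the names are illustrative labels.
CASCADE = [
    Stage("base text-conditional diffusion", None, 64),
    Stage("super-resolution diffusion 1", 64, 256),
    Stage("super-resolution diffusion 2", 256, 1024),
]


def final_resolution(stages: List[Stage]) -> int:
    """Check that each stage consumes what the previous one produced."""
    for prev, nxt in zip(stages, stages[1:]):
        if nxt.in_res != prev.out_res:
            raise ValueError(f"{nxt.name} expects {nxt.in_res}, got {prev.out_res}")
    return stages[-1].out_res


def upsample_factors(stages: List[Stage]) -> List[int]:
    """Per-side upsampling factor of every super-resolution stage."""
    return [s.out_res // s.in_res for s in stages if s.in_res is not None]


def pixel_counts(stages: List[Stage]) -> List[int]:
    """Number of pixels in each stage's square output image."""
    return [s.out_res ** 2 for s in stages]


if __name__ == "__main__":
    print(final_resolution(CASCADE))  # 1024
    print(upsample_factors(CASCADE))  # [4, 4]
    print(pixel_counts(CASCADE))      # [4096, 65536, 1048576]
```

Each super-resolution stage enlarges the image by a factor of 4 per side, so the pixel count grows 16-fold at each step. The base model therefore only has to produce 4,096 pixels, which is one reason a cascade is attractive compared with generating all 1,048,576 pixels of the final image directly.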

Imagen 1 was developed by Google Research's Brain Team and announced on May 23, 2022, alongside a research paper describing the model. Initially, however, Google stated that Imagen was "not suitable for public use at this time." In November 2022, Google made Imagen available in a limited form through its AI Test Kitchen app. The release offered two ways to interact with Imagen, called "City Dreamer" and "Wobble." City Dreamer lets users build a town by describing each of its buildings. Wobble generates unique creatures based on user descriptions.

Imagen 2

Google announced Imagen 2 at its annual I/O developer conference on May 10, 2023, stating that the new text-to-image model would be made available through Vertex AI, the Google Cloud enterprise platform that provides access to foundation models with security and governance controls. The model became generally available on Vertex AI on December 13, 2023, for customers on the approved allowlist. Vertex AI provides access to Imagen 2 on Google-managed infrastructure, along with privacy and safety features.

Imagen 2 provides improved image quality and a range of new features for different use cases, including text rendering, logo generation, and VQA. Imagen 2 was developed using technology from Google DeepMind.

Further Resources

Title: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Author: Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Rapha Gontijo Lopes, Tim Salimans, Jonathan Ho, David J Fleet, Mohammad Norouzi
Link: https://arxiv.org/abs/2205.11487
Date: May 23, 2022
