OpenAI is an artificial general intelligence (AGI) research and development company founded in 2015 and based in San Francisco, California, United States. The AI laboratory consists of the original parent company, with the legal name OpenAI Inc., which acts as a nonprofit, and OpenAI LP, a for-profit company formed in 2019 that now employs most of its staff. OpenAI develops autonomous systems capable of performing work considered economically valuable. The company focuses on long-term research, working on problems that require fundamental advances in AI capabilities. Research by OpenAI is published at top machine learning conferences. The organization also contributes open-source software tools for accelerating AI research and releases blog posts to communicate its research to others in the field. OpenAI has received criticism both for not sharing its research and for the potential misuse of its AI systems.
OpenAI has developed a number of products:
- ChatGPT—a model capable of generating responses to user questions and interacting in a conversational way
- Dall-E—an AI system capable of generating images from natural language descriptions
- Whisper—a neural net for speech recognition that approaches human-level accuracy for the English language
The release of ChatGPT in November 2022 brought significant attention to the company, with over a million registered users in the first five days of launch. A study published in February 2023 estimated that ChatGPT became the fastest-growing consumer application up to that point, with 100 million monthly active users just two months after launch. ChatGPT had roughly 13 million unique visitors in January 2023, over double the numbers of December 2022. Figures released at OpenAI's first developer conference in November 2023 stated that 100 million people are using ChatGPT on a weekly basis and that over two million developers are using the company's API, including developers at over 92% of Fortune 500 companies.
OpenAI investors include Microsoft, Reid Hoffman’s charitable foundation, Khosla Ventures, Sequoia Capital, Tiger Global Management, Bedrock Capital, and Andreessen Horowitz. In July 2019, Microsoft partnered with OpenAI to support the company's AGI research and development. This included a $1 billion investment from Microsoft, a partnership to develop a hardware and software platform for Microsoft Azure that will scale to AGI, and Microsoft becoming OpenAI's exclusive cloud provider. In September 2020, Microsoft also announced it had partnered with OpenAI to exclusively license its language model, GPT-3. Microsoft made a second investment in OpenAI in 2021, although details were not released.
A sale of OpenAI stock in 2021 from existing shareholders to investors (including Sequoia Capital, Tiger Global Management, Bedrock Capital, and Andreessen Horowitz) implied a company valuation of nearly $20 billion. Reports in January 2023 suggested Microsoft and OpenAI were in discussions over a new investment of $10 billion that would value the company at $29 billion, making the company one of the most valuable U.S. startups despite not generating significant revenue up to that point. The deal would result in Microsoft owning a 49 percent stake in OpenAI, with a clause meaning Microsoft would receive three-quarters of OpenAI profits until the investment is recovered. On January 23, 2023, Microsoft and OpenAI announced a new multiyear, multibillion-dollar investment. While the financial terms of the partnership were not revealed, Bloomberg reported the figure of $10 billion. In February 2023, Microsoft announced a new AI-powered Bing search engine and Edge browser integrating OpenAI technology. In February 2024, OpenAI completed a deal to sell existing shares in a tender offer led by Thrive Capital. The deal values OpenAI at over $80 billion.
Revenue estimates from 2022 suggested OpenAI expected $200 million in revenue for the following year and $1 billion by 2024. A report from August 2023 showed OpenAI was making $80 million in revenue a month (compared to $28 million in the entirety of 2022). At that rate, OpenAI would surpass $1 billion in revenue over the following 12 months.
The company is divided into OpenAI Nonprofit (the original company founded in 2015, now parent company) and OpenAI LP (founded in 2019), a "capped" for-profit company that employs the majority of OpenAI's staff. OpenAI LP allows the company to offer investors and employees a capped return depending on its success. Any returns beyond the capped amount are owned by the parent nonprofit entity.
OpenAI LP is governed by OpenAI Nonprofit's board, whose members have included employees Greg Brockman (chairman and president), Ilya Sutskever (chief scientist), and Sam Altman (CEO), and non-employees Elon Musk, Adam D’Angelo, Reid Hoffman, Will Hurd, Tasha McCauley, Helen Toner, and Shivon Zilis. Founding member Elon Musk left the OpenAI board in February 2018 and is no longer formally involved in OpenAI. As well as overseeing OpenAI LP, OpenAI Nonprofit runs educational programs and hosts policy initiatives.
OpenAI employees are organized into three main areas:
- Capabilities—advancing AI systems
- Safety—ensuring OpenAI's systems align with human values
- Policy—providing appropriate governance of OpenAI's systems
In early November 2023, OpenAI's board was made up of Adam D’Angelo, Helen Toner, Ilya Sutskever, Tasha McCauley, Sam Altman, and Greg Brockman (chairman). On November 17, 2023, Altman and Brockman were removed from the board by the other members, with Altman departing the company and Brockman quitting. After Altman was rehired as CEO on November 21, 2023, a reconfigured board was announced, made up of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo (lone holdover from the previous board). Reports suggest the three-person board plans to vet and appoint an expanded board of up to nine people, with Microsoft and Altman potentially receiving seats.
Founding members and investors pledged $1 billion in funding for OpenAI at the company's announcement, although the company estimated it would require only a fraction of that total during its first few years of operations. In July 2019, OpenAI received a $1 billion investment from Microsoft, which in return got exclusive rights to license the company's GPT-3 model a year later. The partnership also made Microsoft OpenAI's exclusive cloud provider. While the financial terms were not disclosed, Microsoft made a second investment in the company in 2021.
In January 2022, OpenAI CEO Sam Altman joked about startup funding, claiming on Twitter that:
After our pre-friends-and-family round in 2016, our F&F round in 2017, our angel round in 2018, our pre-seed round in 2019, our seed round in 2020, and our seed extension in 2021, we're delighted to share we’ve raised a Series A of $250 million.
Investors in OpenAI include Microsoft, Reid Hoffman’s charitable foundation, Khosla Ventures, Sequoia Capital, Tiger Global Management, Bedrock Capital, and Andreessen Horowitz. Reports on January 9, 2023, originating from Semafor, suggested Microsoft and OpenAI had begun discussions over a new investment of $10 billion. The deal would value the company at $29 billion and result in Microsoft owning a 49% stake in OpenAI. It would also contain a clause that Microsoft would receive three-quarters of OpenAI profits until the investment is recovered. A previous sale of OpenAI stock in 2021 had implied a company valuation of nearly $20 billion.
On January 23, 2023, Microsoft and OpenAI announced the third phase of their partnership, with a new multiyear, multibillion-dollar investment to accelerate AI breakthroughs. Financial terms related to the new partnership were not revealed. However, Bloomberg reported a figure of $10 billion, consistent with previous reporting. Continuing the partnership allows for new advances in AI supercomputing and research, with both companies able to independently commercialize the resulting AI technologies. Key areas for collaboration include those below:
- Microsoft increasing its investment in the development and deployment of supercomputing systems to accelerate OpenAI's AI research
- Deploying OpenAI's models across Microsoft's consumer and enterprise products and introducing new digital experiences based on their AI systems
- Azure remaining OpenAI's exclusive cloud provider
Of the continued partnership, Microsoft CEO Satya Nadella said:
We formed our partnership with OpenAI around a shared ambition to responsibly advance cutting-edge AI research and democratize AI as a new technology platform. In this next phase of our partnership, developers and organizations across industries will have access to the best AI infrastructure, models, and toolchain with Azure to build and run their applications.
OpenAI CEO Sam Altman stated:
The past three years of our partnership have been great. Microsoft shares our values and we are excited to continue our independent research and work toward creating advanced AI that benefits everyone.
In April 2023, VC firms including Sequoia Capital, Andreessen Horowitz, Thrive, and K2 Global bought shares in the company, investing over $300 million at a valuation between $27 billion and $29 billion. Founders Fund was also involved in the investment. The finalized tender allowed some OpenAI staff members to cash out their holdings. This tender was followed in October 2023 by another deal, led by Thrive Capital, to buy employee shares, this time at a company valuation of at least $80 billion.
In April 2018, OpenAI published a charter describing the strategy and principles used to execute its mission. The document was developed and refined for two years, including feedback from both internal and external parties. The charter aims to ensure that OpenAI builds safe AGI through systems that benefit all of humanity. It contains the following four principles:
- Broadly distributed benefits—committing any influence OpenAI obtains over AGI development to benefit everyone, preventing the harmful use of AI or AGI, and preventing concentrations of power in the field
- Long-term safety—committing to research that makes AGI safe and helping to ensure safety is broadly adopted across the AI community
- Technical leadership—striving to remain at the forefront of AI capabilities in order to effectively address AGI's impact on society
- Cooperative orientation—cooperating with other research and policy institutions to create a global AGI community working together to solve challenges facing the field
OpenAI was founded on December 11, 2015, as a nonprofit AI research laboratory with the goal to:
advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.
The company is based in San Francisco, California. At the company's founding, the co-chairs were Elon Musk and Sam Altman (who at the time was president of Y Combinator), the research director was Ilya Sutskever, and the CTO was Greg Brockman (former CTO of Stripe). The other founding members were Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, John Schulman, Pamela Vagata, and Wojciech Zaremba. Advisors to this group included Pieter Abbeel, Yoshua Bengio, Alan Kay, Sergey Levine, and Vishal Sikka.
The announcement of OpenAI was accompanied by a commitment to $1 billion in funding from founding members Sam Altman, Greg Brockman, and Elon Musk, as well as investors Reid Hoffman, Jessica Livingston, Peter Thiel, Amazon Web Services (AWS), Infosys, and Y Combinator Research. However, the company was only expecting to spend a small fraction of this figure in the first few years of operations.
On April 27, 2016, OpenAI released the public beta of OpenAI Gym, a toolkit for researchers wanting to develop and compare reinforcement learning (RL) algorithms. In June 2016, OpenAI released details on four projects enhancing or using generative models:
- Improving generative adversarial networks (GANs)
- Extensions of GANs, such as InfoGAN, that can learn interpretable representations of images
- Improving variational autoencoders (VAEs)
- Bayesian neural networks for RL
In November 2016, OpenAI began working with Microsoft for the first time (running most of their large-scale experiments on Azure). On December 5, 2016, the company released Universe, a platform for measuring and training an AI's general intelligence across websites, games, and other applications. The software allows an AI agent to use a computer as a human does, looking at the screen's pixels and operating a virtual mouse and keyboard. Universe's release consisted of thousands of environments, including Flash games, browser tasks, and other games such as GTA V. The goal is to develop an AI agent that can apply past training experiences on Universe environments to master new and unfamiliar environments.
On March 20, 2017, OpenAI launched Distill, a journal aimed at communicating machine learning results. Research milestones from 2017 included the following:
- Evolution strategies that can rival the performance of RL
- Development of an unsupervised system trained on Amazon reviews that can learn sentiment from text
- A class of RL algorithms called proximal policy optimization
- An improvement to RL algorithms by adding adaptive noise
- The release of highly optimized GPU kernels for neural networks with block-sparse weights
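The proximal policy optimization (PPO) algorithm mentioned above centers on a clipped surrogate objective that limits how far a single update can move the policy. A minimal sketch of that objective in Python, for intuition only (not OpenAI's implementation):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, epsilon=0.2):
    """Clipped surrogate objective from PPO.

    ratio: pi_new(a|s) / pi_old(a|s), the policy probability ratio
    advantage: estimated advantage of the action
    epsilon: clipping range (0.2 is the commonly cited default)
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1 - epsilon, 1 + epsilon) * advantage
    # Take the pessimistic minimum of the two, removing the incentive
    # to move the policy too far in one update step.
    return np.minimum(unclipped, clipped)

# A large ratio with positive advantage is capped at (1 + epsilon) * advantage:
print(ppo_clip_objective(ratio=1.5, advantage=2.0))  # 2.4, not 3.0
```

The clipping is what makes PPO stable enough to scale, which is how OpenAI later used it for systems such as OpenAI Five.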
In August 2017, a bot created by OpenAI played 1v1 games of Dota 2 against many top professional players at The International, the annual world championship tournament. OpenAI's bot remained undefeated and played the professional Dendi on the main stage. Dota 1v1 is a complicated game that incorporates hidden information. The company released a video of learned bot behaviors during its games.
After the tournament, OpenAI released a blog post detailing the development of its Dota 2 bot. In August 2018, OpenAI Five, a team of five neural networks, competed in a series of 5v5 Dota 2 games. OpenAI Five won a best-of-three match against 99.95th-percentile Dota players, including four players who had played Dota 2 professionally. The match took place in front of a live audience and 100,000 concurrent livestream viewers. OpenAI Five would go on to lose two games against top Dota 2 players at The International in Vancouver.
OpenAI Five held its final live event on April 13, 2019, competing against the reigning Dota 2 world champions, OG, in front of a live audience. OpenAI Five defeated OG in back-to-back games, becoming the first AI to beat the world champions in an esports game. DeepMind’s AlphaStar had previously beaten professional players privately but lost its live matches. OpenAI Five began as a way to develop deep RL algorithms. To create the Dota 2 bots, OpenAI created a system called Rapid that could run proximal policy optimization at a greater scale. After OpenAI Five's losses at The International in 2018, OpenAI upgraded performance with 8x more training compute.
On February 20, 2018, OpenAI announced Elon Musk had departed the company's board to prevent a potential future conflict, as Tesla was also focusing on AI and working on making their electric cars autonomous. The announcement stated that Musk would continue to donate and advise the company. However, Musk stated in 2020 that "I have no control & only very limited insight into OpenAI,” going on to voice his concerns regarding the company's approach to safety.
In April 2018, OpenAI released a charter describing the principles the company will use to execute its mission of achieving AGI while acting in the best interests of humanity. The document reflects strategy developed over two years at the company and was made using feedback from people both internal and external to OpenAI.
On May 25, 2018, OpenAI released Gym Retro, a platform for RL research on games. The company uses Gym Retro to conduct research on RL algorithms and study generalization. The release includes a series of retro games from Sega's Genesis console and Master System and Nintendo’s NES, SNES, and Game Boy consoles. Gym Retro also includes preliminary support for the Sega Game Gear, Nintendo Game Boy Color, Nintendo Game Boy Advance, and NEC TurboGrafx.
On June 11, 2018, OpenAI published a blog post on a model that achieved strong results on a suite of diverse language tasks. The model, which would go on to be called Generative Pre-trained Transformer (GPT), uses a combination of transformers and unsupervised pre-training. First, a transformer model is trained on a large amount of data in an unsupervised way (using language modeling as a training signal), then the model is fine-tuned on smaller supervised datasets to help solve specific tasks.
On February 14, 2019, OpenAI published details on its next language model, GPT-2. A large transformer-based language model with 1.5 billion parameters and trained on 8 million web pages, GPT-2 aims to predict the next word given the previous words in a piece of text. A direct successor to GPT, GPT-2 has more than ten times the parameters and training data. The large-scale unsupervised language model is capable of generating coherent paragraphs of text and performing rudimentary reading comprehension, machine translation, question answering, and summarization. The initial release of GPT-2 was followed by upgrades in May and August 2019, with the final model (the full 1.5 billion parameter version) being released on November 5, 2019.
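The training objective GPT and GPT-2 share, predicting the next token given the previous ones, can be illustrated with a toy bigram model. This is a drastic simplification of a transformer, shown only to make the objective concrete:

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the corpus."""
    words = text.split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def predict_next(model, word):
    """Return the most frequently observed next word after `word`."""
    return model[word].most_common(1)[0][0]

corpus = "the cat sat on the mat the cat ran"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

A transformer replaces these frequency counts with a learned function of the entire preceding context, but the objective, maximizing the probability of the next token, is the same.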
On April 25, 2019, OpenAI released a deep neural network capable of generating music compositions called MuseNet. The model can produce four-minute compositions with ten different instruments, combining a range of styles. MuseNet was not explicitly programmed with musical knowledge. Instead, it discovers patterns in harmony, rhythm, and style and learns to predict the next token in hundreds of thousands of MIDI files. MuseNet uses the same unsupervised technology as GPT-2 uses to predict the next token in a text sequence.
On November 21, 2019, OpenAI released Safety Gym, a suite of environments and tools for measuring progress toward RL agents that respect safety constraints during training. Safety Gym provides a standardized method for comparing algorithms and how well they avoid costly mistakes while learning.
In March 2019, OpenAI LP was formed as a "capped-profit" company to help raise investment and attract talent. OpenAI LP allows the company to offer investors and employees a capped return (up to 100x) based on its success. Any returns beyond the capped amount are owned by the parent nonprofit entity. Since the creation of OpenAI LP, any official correspondence from the company refers to OpenAI LP as simply "OpenAI," with the original parent company referred to as "OpenAI Nonprofit." OpenAI LP now employs most of the company's workforce. As part of the transition to a for-profit company, Sam Altman, who was previously co-chair, became CEO, stepping down from his role as president of Y Combinator.
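The mechanics of the 100x cap are straightforward: an investor's maximum return is fixed as a multiple of the amount invested, and anything beyond that flows to the nonprofit. A simple illustration with hypothetical numbers:

```python
def capped_return(investment, gross_return, cap_multiple=100):
    """Split a gross return between an investor (capped at cap_multiple
    times the investment) and the nonprofit, per the capped-profit model."""
    cap = investment * cap_multiple
    to_investor = min(gross_return, cap)
    to_nonprofit = max(gross_return - cap, 0)
    return to_investor, to_nonprofit

# A hypothetical $10M investment capped at 100x: returns past $1B
# belong to the nonprofit.
investor, nonprofit = capped_return(10_000_000, 1_500_000_000)
print(investor, nonprofit)  # 1000000000 500000000
```

The cap multiple itself is negotiated per investment; 100x was the figure reported for the earliest OpenAI LP investors.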
The idea to change from a nonprofit entity began in March 2017, when company CTO Greg Brockman and a few other core staff members began drafting a document to plan out a path to AGI. After studying trends in the field, the team came to the realization that staying a nonprofit was financially untenable. The computational resources used by others in the field to achieve breakthroughs were doubling every 3.4 months. To continue on a path toward AGI, the company would require significant capital to match or exceed this exponential ramp-up in computational resources, which meant changing the organizational model to enable rapidly amassing funding. This planned transition was behind the formulation of the OpenAI charter (released in April 2018), which re-articulated the company's core values ahead of the change to a for-profit model.
After switching to a capped-profit business, OpenAI's leadership instituted a new pay structure based in part on how deeply each employee absorbed the company's mission. The structure included culture-related expectations for each level. Examples include the following:
- Level 3—“You understand and internalize the OpenAI charter.”
- Level 5—“You ensure all projects you and your teammates work on are consistent with the charter.”
- Level 7—“You are responsible for upholding and improving the charter, and holding others in the organization accountable for doing the same.”
The transition from a nonprofit attracted criticism, with many pointing out that a capped return of 100x made the company much the same as any other for-profit company. Others worried the company would become less open with its research.
Shortly after the transition, in July 2019, OpenAI and Microsoft announced a partnership to work together on building AGI. This included a $1 billion investment from Microsoft, as well as the following:
- Microsoft and OpenAI jointly building new Azure AI supercomputing technologies
- OpenAI porting its services to run on Microsoft Azure
- Microsoft becoming OpenAI’s preferred partner for commercializing new AI technologies
OpenAI CTO Greg Brockman gave more details on the investment, stating:
It’s a cash investment into OpenAI LP. It uses a standard capital commitment structure, to be called as we need it. We plan to spend it in less than five years, and possibly much sooner.
OpenAI introduced a neural network that generates music, called Jukebox, on April 30, 2020. The model produces music, including rudimentary singing, in a variety of genres and artist styles, outputting the results as raw audio. OpenAI released the weights and code as well as a tool to explore the generated samples.
On May 28, 2020, OpenAI researchers submitted a paper describing the company's latest language model GPT-3. Titled "Language Models are Few-Shot Learners," the paper describes an autoregressive language model that has 175 billion parameters and demonstrates its performance for a number of tasks. The model is capable of generating samples of news articles that evaluators found difficult to distinguish from human-written articles.
On June 11, 2020, OpenAI released its first commercial product, a private beta of its API for accessing AI models developed by the company. Developers could request access to the private beta and then integrate the API into their own products or develop entirely new applications. The API runs models with weights from the GPT-3 family with speed and throughput improvements. The API provides a "text in, text out" interface that can be used on virtually any English language task. When provided with a text prompt, the API returns a text completion matching the pattern provided. It is programmable by showing it a few examples of what you would like it to do. Users can also improve performance for specific tasks by providing a training dataset of examples or using human feedback.
The API was introduced to act as a revenue source for the company and to help understand its impact in the real world. Additionally, the API makes AI systems more accessible to smaller businesses and helps OpenAI learn how to respond to the misuse of its technology. OpenAI cited safety as the key reason why they chose to control access to their AI systems rather than releasing them as open-source models. The private beta received tens of thousands of applicants. Companies participating in the private beta included Algolia, Quizlet, and Reddit. Nine months after the launch of the OpenAI API, more than 300 applications were using GPT-3 with tens of thousands of developers building on the platform. On average, the API was generating 4.5 billion words a day.
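The "programmable by examples" behavior described above amounts to assembling a few-shot prompt: demonstrations of the task are concatenated into the "text in" side, and the model continues the pattern. A sketch of how such a prompt might be constructed (the task and formatting here are hypothetical, not OpenAI's prescribed format):

```python
def build_few_shot_prompt(examples, query):
    """Assemble a 'text in' prompt that demonstrates a task by example,
    leaving the final completion for the model to fill in."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

# A hypothetical sentiment task shown via two demonstrations:
examples = [
    ("I loved this film", "positive"),
    ("The service was terrible", "negative"),
]
prompt = build_few_shot_prompt(examples, "What a wonderful day")
print(prompt.endswith("Output:"))  # True; the model completes from here
```

The prompt string would then be sent to the API, whose completion of the trailing "Output:" constitutes the "text out" half of the interface.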
In June 2020, OpenAI published results applying the same techniques behind the GPT-2 language models to images. They showed that a large transformer model trained on pixel sequences can generate coherent image completions and samples.
OpenAI deliberately used the same transformer architecture GPT-2 uses for language to highlight the potential of unsupervised learning algorithms for generative sequence modeling. Image GPT works by training a sufficiently large transformer on next-pixel prediction, from which it can generate diverse image samples. OpenAI's work showed that, given sufficient compute, a sequence transformer can generate results comparable to convolutional nets for unsupervised image classification.
In September 2020, OpenAI licensed its GPT-3 technology to Microsoft as part of a multiyear partnership announced in July 2019. The agreement allows Microsoft to use GPT-3 for its own products and services and has no impact on access to the model through OpenAI's API.
On January 5, 2021, OpenAI released information on two projects related to image-generating AI:
- Contrastive Language-Image Pre-training (CLIP)—a neural network developed by OpenAI that can learn visual concepts from natural language descriptions
- Dall-E (pronounced dolly)—a neural network capable of creating images from text captions for a wide range of concepts
CLIP exhibits "zero-shot" capabilities similar to those of OpenAI's GPT-2 and GPT-3 language models, in that it can be applied to any visual classification benchmark simply by providing the names of the visual categories. CLIP was developed to address a number of pain points in computer vision, including costly, labor-intensive vision datasets, the narrow capabilities of vision models (typically successful at only a single task), and poor performance on stress tests. CLIP is trained on a wide variety of images abundantly available on the internet, paired with natural language, allowing it to perform a great variety of classification benchmarks without directly optimizing for any of them.
CLIP is used to rerank samples from OpenAI's generative neural network Dall-E, which produces images based on a user's natural language description. Dall-E is a 12-billion-parameter version of GPT-3 trained on a dataset of text-image pairs. The model has a range of image generation capabilities, including creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, rendering text, and applying transformations to existing images.
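Zero-shot classification with a CLIP-style model reduces to comparing an image embedding against text embeddings of the candidate labels and picking the closest match. A schematic sketch with stand-in vectors (real CLIP embeddings come from its trained image and text encoders and are much higher-dimensional):

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, labels):
    """Pick the label whose text embedding has the highest cosine
    similarity to the image embedding (CLIP-style matching)."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img  # cosine similarity, one score per label
    return labels[int(np.argmax(sims))]

# Stand-in 3-d embeddings; a real model produces e.g. 512-d vectors.
image_emb = np.array([0.9, 0.1, 0.0])
text_embs = np.array([
    [1.0, 0.0, 0.0],   # e.g. "a photo of a dog"
    [0.0, 1.0, 0.0],   # e.g. "a photo of a cat"
    [0.0, 0.0, 1.0],   # e.g. "a photo of a car"
])
print(zero_shot_classify(image_emb, text_embs, ["dog", "cat", "car"]))
```

Because the label set is supplied at inference time as text, the same model can be pointed at a new benchmark without retraining, which is the "zero-shot" property described above.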
Key OpenAI announcements during 2021 included those below:
- Congressman Will Hurd and Helen Toner joining the company's board of directors
- The launch of the OpenAI Startup fund
- Improving OpenAI Codex, an AI system that is able to translate natural language into code (the model that powers GitHub Copilot)
- The OpenAI API becoming available without a waiting list
OpenAI launched the second iteration of its image-generation tool, Dall-E 2, on April 6, 2022. The new model offers significant improvements over the original, which was slow to return images and sometimes produced grainy results. Dall-E 2 produces images with improved realism and accuracy and 4x greater resolution. Dall-E 2 can create original images from text inputs, edit existing images, create variations of existing images, and expand existing images beyond the original canvas. In October 2022, Shutterstock announced a partnership with OpenAI to integrate Dall-E 2 with its existing content and launch a new fund to compensate artists for their contributions.
In August 2022, OpenAI released details on its research approach to making AGI align with human values and follow human intent. This includes an iterative and empirical approach to making AI systems safer.
In September 2022, OpenAI introduced Whisper, a neural net for speech recognition that is approaching human-level robustness and accuracy for the English language. An automatic speech recognition (ASR) system, Whisper was trained on 680,000 hours of multilingual and multitask supervised data from the internet. Using a large and diverse dataset allows the model to improve its robustness to accents, background noise, and technical language. Whisper offers transcription in multiple languages and translation from those languages into English.
On November 30, 2022, OpenAI released ChatGPT, a language model chatbot based on GPT-3.5. ChatGPT can provide answers to questions, allowing users to interact in a conversational manner with follow-up questions and responses. The model offers a range of outputs depending on the user's request, including code, poems, songs, essays, stories, and more. ChatGPT is noted for its ability to generate material rather than act as a source of information. OpenAI plans to deploy ChatGPT iteratively, with the initial release acting as a research preview to improve its safe use. During the research preview, the model was freely available to users with an OpenAI account. Within five days of ChatGPT's research preview release, it had over a million registered users.
On January 23, 2023, Microsoft and OpenAI announced a new multiyear, multibillion-dollar investment. While financial terms related to the new partnership were not revealed, reports suggested a figure of $10 billion.
In January 2023, OpenAI announced plans on its Discord for a paid version of ChatGPT. Before an official announcement from the company, users began reporting having access to a new pro tier of the AI system priced at $42 a month.
On February 1, 2023, OpenAI announced the launch of ChatGPT Plus, a pilot subscription plan for the conversational AI system. Priced at $20 per month, subscribers will receive:
- General access to ChatGPT, even during peak times
- Faster response times
- Priority access to new features and improvements
ChatGPT Plus was initially only available to US customers, with OpenAI beginning the process of inviting users from its waiting list over the weeks following the launch. OpenAI plans to expand access, add support for additional countries, and launch the ChatGPT API waitlist soon after launching ChatGPT Plus. The subscription version of the popular AI system will be refined with expanded features based on user feedback. During the announcement, OpenAI stated its plans to continue offering free access to ChatGPT, saying:
By offering this subscription pricing, we will be able to help support free access availability to as many people as possible.
On January 31, 2023, OpenAI launched a classifier trained to be able to distinguish between AI-written and human-written text. Considered a "work in progress," the classifier is not fully reliable. OpenAI evaluations on a challenge set of English texts found the classifier:
- correctly identifies 26% of AI-written text (true positives) and
- incorrectly identifies human-written text as AI-written 9% of the time (false positives).
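The two figures above are a true positive rate and a false positive rate. How they would be computed from labeled evaluation counts (the counts below are illustrative numbers consistent with the reported percentages, not OpenAI's raw data):

```python
def classifier_rates(tp, fn, fp, tn):
    """True positive rate (share of AI-written text correctly flagged)
    and false positive rate (share of human-written text wrongly flagged)."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr

# Hypothetical counts matching the reported 26% TPR and 9% FPR:
tpr, fpr = classifier_rates(tp=26, fn=74, fp=9, tn=91)
print(f"TPR={tpr:.0%}, FPR={fpr:.0%}")  # TPR=26%, FPR=9%
```

A 26% true positive rate with a 9% false positive rate is why OpenAI characterized the classifier as unreliable: most AI-written text passes undetected, while a meaningful fraction of human writing is wrongly flagged.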
The classifier's reliability improves with the length of the text. OpenAI made the classifier publicly available to gain feedback on whether these kinds of imperfect tools are useful, with the hope to share improved methodologies in the future. The hope is that better-performing classifiers can mitigate the risks of AI-generated text, such as misinformation campaigns and academic dishonesty.
The classifier was trained on a dataset consisting of pairs of human- and AI-written text on the same topic. The dataset came from a variety of sources, including the pretraining data submitted to InstructGPT. The text was divided into a prompt and response, with responses generated using a variety of language models from OpenAI and other organizations.
On February 7, 2023, Microsoft announced new AI-powered versions of its search engine, Bing, and its browser, Edge, that integrate OpenAI technology. The new Bing runs on an OpenAI large language model called the Prometheus Model, which takes key advancements from GPT-3.5 and ChatGPT and customizes them for search. Microsoft refers to these new tools as an "AI copilot for the web," creating a new way to browse the internet with users able to ask Bing questions and receive answers, similar to ChatGPT. During the announcement, Microsoft showed a number of demos, including Bing returning traditional search results with AI annotations and allowing users to talk directly to the Bing chatbot. Unlike ChatGPT, Bing can now retrieve information about recent events.
The new Bing search engine is available as a limited preview on desktop from February 7, 2023. Users can try sample queries and sign up for a waiting list for full access in the future.
Microsoft is also launching two new AI features for the Edge web browser—"chat" and "compose." Embedded in the Edge sidebar, the chat function lets users ask for a summary of a webpage or document, or ask questions about it. Compose works as a writing assistant, generating text based on starting prompts.
On March 14, 2023, OpenAI announced GPT-4, a large multimodal model that can accept both image and text inputs and respond with text outputs. A transformer-based model trained to predict the next token in a document, GPT-4 allows users to specify vision and language tasks. Only the model's text capabilities were initially available, either through ChatGPT Plus or OpenAI's API, with a waiting list for access. OpenAI stated it was working closely with a single partner to prepare the image input capability for public release. Microsoft has since confirmed its AI-enabled Bing was already running an early version of GPT-4. GPT-4 was trained on Microsoft Azure AI supercomputers. Training finished in August 2022 and was followed by six months of model alignment to improve safety.
On August 28, 2023, OpenAI launched ChatGPT Enterprise, a new version of ChatGPT for businesses that offers the following:
- Enterprise-grade security and privacy
- Unlimited higher-speed GPT-4 access
- Longer context windows for processing longer inputs
- Advanced data analysis capabilities
- Customization options
On July 5, 2023, OpenAI announced a new superalignment team to work on the scientific and technical breakthroughs required to control future AI systems that will be smarter than humans. The team will be co-led by Ilya Sutskever and Jan Leike, with the company dedicating 20 percent of its compute power to the project, which it hopes to solve within four years.
On September 20, 2023, OpenAI released Dall-E 3, the next iteration of its text-to-image generation model. Initially released as a research preview, Dall-E 3 became available in ChatGPT Plus and Enterprise on October 19, 2023. OpenAI also announced plans to make the model available via the API and in Labs later in 2023.
Dall-E 3 is integrated with ChatGPT such that users can use the chatbot to generate detailed prompts to guide the model. Dall-E 3 takes advantage of numerous research advancements from within OpenAI and outside of the company. The updated model understands significantly more nuance and detail than the previous iterations, allowing users to translate ideas into more accurate images.
OpenAI states that Dall-E 3 creates images that are more visually striking and crisper in detail, particularly when rendering intricate details such as text, hands, and faces. The model also supports both landscape and portrait aspect ratios.
In November 2023, OpenAI held its first developer conference. During the conference, the company made a number of announcements, including:
- GPT-4 Turbo—a more capable and cheaper version of the model that supports a 128K context window.
- Assistants API—a new tool for developers building their own assistive AI apps.
- Multimodal capabilities—the inclusion of vision, image creation (DALL·E 3), and text-to-speech (TTS) capabilities in OpenAI's platform.
- Reduced prices and higher rate limits—lower prices for a number of models including GPT-4 Turbo, GPT-3.5 Turbo, and Fine-tuned GPT-3.5 Turbo 4K, and the doubling of the tokens per minute limit for paying GPT-4 customers.
- Copyright Shield—OpenAI will defend its customers (paying the costs incurred) if they face legal claims around copyright infringement.
- Custom GPTs—a new way for users to create tailored versions of ChatGPT, called GPTs. Without coding, users can begin building chatbots better suited to their needs. User-generated GPTs can be shared publicly, and OpenAI has plans to launch a GPT Store, a searchable tool and leaderboard of creations from verified builders.
On Friday, November 17, 2023, OpenAI announced Sam Altman had departed the company (leaving his role as CEO and board member), with CTO Mira Murati appointed interim CEO while the company searched for a permanent successor. The surprise announcement came eleven days after Altman led the keynote at OpenAI's first DevDay conference. The board of directors of OpenAI, Inc., the original 501(c)(3) nonprofit, made the announcement following a deliberative review process, concluding that Altman
was not consistently candid in his communications with the board, hindering its ability to exercise its responsibilities. The board no longer has confidence in his ability to continue leading OpenAI.
In their statement, the board of directors went on to say:
OpenAI was deliberately structured to advance our mission: to ensure that artificial general intelligence benefits all humanity. The board remains fully committed to serving this mission. We are grateful for Sam’s many contributions to the founding and growth of OpenAI. At the same time, we believe new leadership is necessary as we move forward. As the leader of the company’s research, product, and safety functions, Mira is exceptionally qualified to step into the role of interim CEO. We have the utmost confidence in her ability to lead OpenAI during this transition period.
At the time of the announcement, OpenAI's board of directors consisted of OpenAI chief scientist Ilya Sutskever, Quora CEO Adam D’Angelo, technology entrepreneur Tasha McCauley, and Georgetown Center for Security and Emerging Technology’s Helen Toner. OpenAI's corporate structure means the board is not tasked with maximizing shareholder value and none of the board members hold equity in the company. OpenAI's announcement also stated that Greg Brockman would be stepping down as chairman of the board but would remain in his role at the company.
However, Brockman announced he had quit the company hours after the announcement. Brockman stated he and Altman were notified of their removal from the board that day by Sutskever. Sources have stated that Sutskever was instrumental in the removal of Altman following a power struggle between the research and product sides of the company. Reports stated that OpenAI employees found out the news from the public announcement. After the news broke, a number of OpenAI employees also resigned from the company Friday evening.
On Saturday, November 18, reports stated Altman was in discussions to return to OpenAI (an idea pushed by a number of OpenAI investors, including Microsoft) or to start a new AI venture. It was also revealed that OpenAI investors were not given advance warning or an opportunity to weigh in on the board's decision to remove Altman. On Saturday evening, Altman posted on X, "i love the openai team so much." This was followed by a large number of OpenAI employees demonstrating their support by reposting him with a heart emoji, and rumors began of mass resignations at the company if Altman was not reinstated. An internal memo by OpenAI COO Brad Lightcap stated Altman's removal by the board was over a "breakdown of communications," not "malfeasance."
Discussions around Altman's return failed on Sunday, November 19, with the company instead hiring former Twitch CEO Emmett Shear as its interim CEO, replacing Murati. Microsoft CEO Satya Nadella announced Altman would be leading a new AI department at Microsoft alongside Brockman and other recently departed OpenAI employees. Nadella posted on X:
We look forward to getting to know Emmett Shear and OAI's new leadership team and working with them,
And:
we're extremely excited to share the news that Sam Altman and Greg Brockman, together with colleagues, will be joining Microsoft to lead a new advanced AI research team.
On Monday, November 20, over 730 of roughly 770 OpenAI employees signed an open letter threatening to leave the company unless the board resigned and reinstated Altman as CEO and Brockman as president. The letter reads:
The process through which you terminated Sam Altman and removed Greg Brockman from the board has jeopardized all of this work and undermined our mission and company... Your conduct has made it clear you did not have the competence to oversee OpenAI.
Among the letter's signatories were previous interim CEO Murati and Sutskever, who posted on X shortly before its release:
I deeply regret my participation in the board’s actions. I never intended to harm OpenAI. I love everything we’ve built together and I will do everything I can to reunite the company.
The letter called for the board to resign and for two new independent lead board members, Bret Taylor and Will Hurd, to be appointed. The employees threatened to join Altman's newly announced AI department at Microsoft if their demands were not met.
On Tuesday, November 21, it was announced that Altman and OpenAI had come to an agreement in principle for his return as CEO with a reconfigured board. The announcement from the company, posted on X, read:
We have reached an agreement in principle for Sam Altman to return to OpenAI as CEO with a new initial board of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo. We are collaborating to figure out the details. Thank you so much for your patience through this.
Taylor is the former co-CEO of Salesforce, and Summers is a former US Treasury Secretary. D'Angelo is the lone holdover from the previous board. Altman posted on X after the announcement:
With the new board and w satya's support, i'm looking forward to returning to openai, and building on our strong partnership with msft... i love openai, and everything i've done over the past few days has been in service of keeping this team and its mission together,
Microsoft's Nadella posted:
We are encouraged by the changes to the OpenAI board. We believe this is a first essential step on a path to more stable, well-informed, and effective governance.
Former president Brockman also returned to the company, and both sides agreed to an investigation into the turmoil. Sources have stated the new three-person board's first job is to vet and appoint an expanded board of up to nine people, with reports that Microsoft and Altman himself want a seat on the new board.
On December 27, 2023, the New York Times sued OpenAI and Microsoft for copyright infringement. The news outlet became the first major American media organization to sue the companies over copyright issues regarding its written work. The lawsuit was filed in Federal District Court in Manhattan. It contends that the companies used millions of articles published by the New York Times to train chatbots that are now in competition with it as a source of reliable information. The lawsuit states the defendants should be held responsible for “billions of dollars in statutory and actual damages” and they should destroy any chatbot models and training data using copyrighted material from the New York Times.
On January 10, 2024, OpenAI introduced the GPT store to ChatGPT Plus, Team, and Enterprise users. The rollout of the GPT store came two months after the company announced custom GPTs at its first developer conference. OpenAI stated that in those two months, users created over three million custom versions of ChatGPT. The store offers a range of GPTs developed by OpenAI's partners and the wider community.
On January 10, 2024, OpenAI launched ChatGPT Team, a new ChatGPT plan designed for teams of all sizes that provides a secure, collaborative workspace. The new plan offers access to OpenAI models, such as GPT-4 and DALL·E 3, as well as tools like Advanced Data Analysis. It also includes a collaborative workspace for team and admin tools plus:
- GPT-4 with 32K context window
- Higher message caps for OpenAI tools
- Custom GPTs in the shared workspace
- Admin console for workspace and team management
- Early access to new features and improvements
ChatGPT Team costs $25/month per user billed annually or $30/month per user billed monthly.
OpenAI announced the text-to-video model Sora on February 15, 2024. Sora is a diffusion model that can output videos up to one minute long based on user prompts. Upon the announcement, OpenAI made Sora available to red teamers, to assess potential areas of harm and risks, and visual artists, designers, and filmmakers to gain feedback on performance.
GPT-3 is an autoregressive language model with 175 billion parameters made by OpenAI. It launched May 29, 2020. The model builds on OpenAI's previous transformer-based language model, GPT-2, which had 1.5 billion parameters. GPT-3 improves on GPT-2 by adopting and scaling features present in GPT-2, such as modified initialization, pre-normalization, and reversible tokenization. GPT-3 demonstrates that scaling up language models greatly improves task-agnostic, few-shot performance compared to GPT-2. OpenAI claims GPT-3 approaches the performance of state-of-the-art fine-tuned systems on defined tasks, generating high-quality samples.
According to the GitHub repository of GPT-3, it can achieve strong performance on NLP datasets, such as translation, question-answering, and cloze tasks. It can also perform on-the-fly reasoning and domain adaptation tasks, like unscrambling words, using novel words in sentences, and performing three-digit arithmetic. GPT-3 still faces methodological issues with few-shot learning and with training on large web corpora, and it can produce undesirable results. GPT-3 was found to be capable of producing news articles that are difficult for humans to distinguish from news articles written by humans.
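Few-shot performance means the model is conditioned on a handful of worked examples placed directly in the prompt, with no weight updates. A minimal sketch of how such a prompt is assembled, using three-digit arithmetic (one of the tasks mentioned above); the exact format is an illustrative assumption, not GPT-3's evaluation harness:

```python
def build_few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: worked examples followed by the new query,
    leaving the final answer for the model to complete."""
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)

examples = [
    ("What is 123 + 456?", "579"),
    ("What is 810 - 205?", "605"),
]
prompt = build_few_shot_prompt(examples, "What is 342 + 117?")
print(prompt)
```

The model sees the pattern in the examples and continues it, which is how "on-the-fly" adaptation works without any fine-tuning.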
OpenAI has released improved models within the GPT-3.5 series. These were trained on a blend of text and code from before the end of 2021.
ChatGPT is a variant of OpenAI's popular language generation family of models GPT (Generative Pre-trained Transformer). ChatGPT is designed for chatbot applications and has the ability to generate human-like responses to user input in a conversation, with follow-up questions and responses. The language model is capable of producing code, poems, songs, essays, stories (inspired by specific authors), and more. ChatGPT is generative, completing tasks for users rather than solely being a source of information.
The GPT models were trained on a large dataset of internet text and can generate human-like text for a variety of language tasks. As with GPT, ChatGPT uses a transformer architecture and is trained using unsupervised learning, which means it is able to learn from raw text data without the need for explicit labels or annotations. This allows it to generate text that is highly fluent and human-like, making it well-suited for chatbot applications when the goal is to create a natural and seamless conversation with users.
The original release of ChatGPT was a fine-tuned model from the GPT-3.5 series, adding an additional layer of training called Reinforcement Learning from Human Feedback (RLHF). An initial model utilized supervised fine-tuning, with human AI trainers providing conversations in which they played both sides—the user and the AI assistant. Human trainers were given access to model-written suggestions to compose responses. To build a reward model for reinforcement learning, these conversations were collected as comparison data: model-written messages were randomly selected, several alternative completions were sampled, and trainers ranked their quality. This information was fed back into the model using proximal policy optimization. The process was repeated over several iterations to improve performance.
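The trainer rankings described above are typically turned into a reward model with a pairwise comparison objective. A minimal sketch of one common formulation (a Bradley-Terry-style ranking loss); this illustrates the idea, not OpenAI's actual training code:

```python
import math

def pairwise_ranking_loss(score_preferred, score_rejected):
    """Negative log-likelihood that the preferred response outranks the
    rejected one: -log(sigmoid(s_preferred - s_rejected)). Minimizing this
    pushes the reward model to score preferred completions higher."""
    diff = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# When the reward model already ranks the preferred response higher,
# the loss is small; when it ranks it lower, the loss is large.
print(round(pairwise_ranking_loss(2.0, 0.5), 4))
print(round(pairwise_ranking_loss(0.5, 2.0), 4))
```

The trained reward model then supplies the scalar reward that proximal policy optimization maximizes during the reinforcement learning stage.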
A research preview of ChatGPT was released on November 30, 2022, with the model freely available to users with an OpenAI account. On February 1, 2023, OpenAI launched ChatGPT Plus, a paid subscription pilot available to US customers, for $20 per month. ChatGPT Plus offers users benefits that include access to ChatGPT during peak times, faster response times, and priority access to new features. While the free version of ChatGPT utilizes GPT-3.5, ChatGPT Plus is based on GPT-4. On August 28, 2023, OpenAI launched ChatGPT Enterprise, a new version of ChatGPT for businesses that offers enhanced security features and unlimited, high-speed access to GPT-4.
GPT-4 is a large multimodal model with greater performance compared to previous GPT generations. The transformer-based model can accept both text and image inputs, performing vision and language tasks based on the user's prompt. GPT-4 remains less capable than humans in many real-world scenarios; however, the model demonstrates human-level performance on a range of professional and academic benchmarks, including scoring in the top 10% of test takers on a simulated bar exam.
Compared to ChatGPT, GPT-4 can handle text inputs and outputs over 8 times longer—roughly 25,000 words (GPT-4) compared to 3,000 words (ChatGPT). Its image input capabilities include understanding images, reasoning about their contents, and generating captions, classifications, and analyses.
GPT-4 Turbo is a more capable and more up-to-date version of GPT-4. GPT-4 Turbo has knowledge of world events up until April 2023 and a 128k context window (the equivalent of over 300 pages of text in a single prompt). Upon release, OpenAI also stated enhanced optimizations mean they are offering GPT-4 Turbo at a 3x cheaper price for input tokens and 2x cheaper price for output tokens compared to the original GPT-4 model.
Dall-E 2 is an AI system capable of generating images from a natural language description. The name Dall-E is a reference to the artist Salvador Dali and the Pixar character Wall-E. It was chosen to reflect the combination of creativity and technical capabilities the system represents. The model combines concepts, attributes, and styles to create original images and art from a text description, make edits to existing images, create variations of existing images, and expand existing images beyond the original canvas. The model uses a process called "diffusion" to learn the relationship between images and the text describing them. What starts as a pattern of random dots gradually refines toward an image based on the description.
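The "diffusion" process described above can be illustrated with a toy refinement loop: start from random noise and repeatedly subtract an estimate of the noise, moving toward a clean result. In a real diffusion model the noise estimate comes from a neural network conditioned on the text description; here the target array and step rule are purely illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the clean image the text description points toward.
target = np.linspace(0.0, 1.0, 64)

# Sampling starts from pure noise, as in diffusion models.
x = rng.standard_normal(64)

# Each step removes a fraction of the estimated noise; a real model would
# predict this noise with a learned network rather than use the target.
for step in range(50):
    estimated_noise = x - target
    x = x - 0.1 * estimated_noise

# After many small denoising steps, the sample is close to the target.
print(float(np.abs(x - target).mean()))
```

The key idea is that generation happens gradually, through many small denoising steps, rather than in a single pass.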
Dall-E 2 offers significant improvements compared to the original Dall-E AI system, which took significant time to return images and had issues producing grainy images. Dall-E 2's images exhibit greater realism and accuracy with 4x greater resolution. Evaluators asked to compare 1,000 image generations from both Dall-E 1 and Dall-E 2 preferred Dall-E 2 for caption matching (71.7%) and photorealism (88.8%).
Dall-E 2 works by training a diffusion decoder to invert a CLIP image encoder. Its inverter is non-deterministic, producing multiple images for a given image embedding. Utilizing an encoder and decoder also allows for applications beyond text-to-image translation, as is the case for Generative Adversarial Network (GAN) inversion producing semantically similar output images when encoding and decoding an input.
Dall-E 3 is the next iteration of OpenAI's text-to-image AI model. Dall-E 3 offers improved capabilities compared to Dall-E 2, understanding more nuance and details to help users better translate their ideas into images. The newer model also integrates with ChatGPT allowing users to utilize the chatbot when generating detailed prompts. OpenAI describes this integration as making use of ChatGPT as a "brainstorming partner," converting simple user-provided sentences into more detailed paragraphs to improve the accuracy of Dall-E 3.
OpenAI states Dall-E 3 generated images are more visually striking and crisper in detail compared to Dall-E 2. Dall-E 3 improvements are particularly noticeable when generating text, hands, and faces. The model also supports both landscape and portrait aspect ratios. These improvements are achieved by training an image captioner to generate better textual descriptions for images. These images with improved captions were then used to train Dall-E 3, producing a model more responsive to user-supplied captions.
OpenAI Codex is an AI system that translates natural language into code. A general-purpose programming model, Codex can be applied to effectively any coding task, though results vary. Codex powers GitHub Copilot, a project built in partnership with GitHub that suggests code and entire functions in real time. Codex is proficient in over a dozen programming languages (Python, JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, Shell, etc.), interpreting simple natural language commands and executing them on the user's behalf. The model makes it possible to build a natural language interface for existing applications.
A descendant of GPT-3, Codex is trained on both natural language and billions of lines of source code from publicly available sources, including public GitHub repositories. Codex has a similar natural language understanding as GPT-3 but translates this information into working code. This requires the following:
- Breaking up a larger problem into a series of simpler problems
- Mapping the resulting simpler problems to existing code (libraries, APIs, or functions)
Whisper is an automatic speech recognition (ASR) system that is approaching human levels of accuracy for the English language. The model is trained on 680,000 hours of multilingual and multitask supervised data collected from the internet. Using such a large training dataset helps Whisper improve its robustness to accents, background noise, and technical language, enabling transcription in multiple languages and translating from various languages into English.
Whisper's architecture uses an end-to-end approach implemented as an encoder-decoder transformer. Audio is divided into thirty-second chunks before being converted into a log-Mel spectrogram and then passed into an encoder. A decoder is trained to predict the corresponding text caption and perform specific tasks, such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation. While specialized models show better speech recognition performance, using a large and diverse dataset allows Whisper to be used for a variety of tasks with fewer errors. Roughly a third of Whisper's audio dataset is non-English. OpenAI open-sourced the Whisper model and inference code.
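The chunking step described above can be sketched in a few lines; the spectrogram conversion and the transformer itself are omitted, and the 16 kHz sample rate reflects Whisper's resampling of input audio:

```python
import numpy as np

SAMPLE_RATE = 16_000   # Whisper resamples input audio to 16 kHz
CHUNK_SECONDS = 30     # fixed window length fed to the encoder

def chunk_audio(samples):
    """Split a 1-D waveform into 30-second chunks, zero-padding the last
    chunk so every window has the same fixed length."""
    chunk_len = SAMPLE_RATE * CHUNK_SECONDS
    n_chunks = -(-len(samples) // chunk_len)   # ceiling division
    padded = np.zeros(n_chunks * chunk_len, dtype=samples.dtype)
    padded[: len(samples)] = samples
    return padded.reshape(n_chunks, chunk_len)

# 70 seconds of (silent) audio becomes three fixed-length 30-second chunks.
audio = np.zeros(70 * SAMPLE_RATE, dtype=np.float32)
print(chunk_audio(audio).shape)
```

Each fixed-length chunk is then converted to a log-Mel spectrogram before being passed to the encoder, which is why the window length must be uniform.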
MuseNet is a deep neural network capable of composing four-minute musical pieces with up to ten different instruments and in any genre. The software was not programmed to explicitly understand music; instead, it learned to predict the next "token" in a MIDI file, applying the same unsupervised technology as GPT-2 to hundreds of thousands of MIDI files. To train, the organization used data from ClassicalArchives and BitMidi. MuseNet can discover patterns in harmony, rhythm, and style. The neural network uses the recompute and optimized kernels of the Sparse Transformer to train a 72-layer network with twenty-four attention heads—over a context of 4096 tokens.
Jukebox is a neural network capable of generating music in a wide variety of genres and styles, including rudimentary singing, as raw audio. While MuseNet explored composing music based on significant amounts of MIDI data, in raw audio, Jukebox has to deal with high diversity and long-range structures. Generating music requires dealing with long sequences. For example, a typical four-minute song at CD quality (44kHz, 16-bit) contains over 10 million timesteps. OpenAI's GPT-2 model had 1,000 timesteps and the company's AI Dota 2 team, OpenAI Five, took tens of thousands of timesteps across an entire game. Therefore, to produce music, Jukebox has to deal with long-range dependencies. This is done using an autoencoder that compresses raw audio to a lower-dimensional space and discards some of the irrelevant bits of information. Jukebox generates audio in this compressed space before upsampling back to the quality needed for raw audio.
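The sequence-length figure above follows directly from the audio parameters, and a short calculation shows why compression is necessary (the 128x compression factor below is illustrative, not Jukebox's exact configuration):

```python
SAMPLE_RATE_HZ = 44_100   # CD-quality audio
SONG_SECONDS = 4 * 60     # a typical four-minute song

# One timestep per audio sample: over 10 million for a four-minute song.
timesteps = SAMPLE_RATE_HZ * SONG_SECONDS
print(f"{timesteps:,}")

# An autoencoder compressing raw audio by an illustrative factor of 128
# shrinks the sequence to a length a transformer can realistically model.
print(timesteps // 128)
```

This is the motivation for generating in a compressed latent space first and upsampling back to raw audio afterward.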
Sora is OpenAI's text-to-video AI model capable of generating videos up to a minute long based on user prompts. Sora is a diffusion model that generates videos from a starting point of static noise. It uses a transformer architecture similar to OpenAI's GPT models and builds on previous research from the Dall-E models, in particular, using the recaptioning technique from Dall-E 3, which involves generating descriptive captions for visual training data. Analogous to text tokens for LLMs, Sora uses visual patches, an effective representation of visual data. Patches are scalable and allow generative models to be trained on a range of video and image types. At a high level, Sora turns videos into patches by compressing them into a lower-dimensional latent space and decomposing the representation into spacetime patches.
Sora is a generalist model for visual data, with the ability to generate videos and images of diverse durations, aspect ratios, and resolutions, outputting up to one minute of high-definition video. It can generate entire videos at once, extend previously generated videos to make them longer, add missing frames to an existing video, and animate an existing still to generate a video. Sora-generated scenes can have multiple characters, specific types of motion, details for both the subject and background, and multiple shots within a single generated video with persistent characters and visual style.
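The decomposition into spacetime patches can be sketched as a tensor reshape: a compressed video latent is cut into small blocks spanning both time and space, and each block is flattened into one "token." The patch sizes and latent shape below are illustrative assumptions, not Sora's actual configuration:

```python
import numpy as np

def spacetime_patches(video, t_patch=4, h_patch=16, w_patch=16):
    """Decompose a (T, H, W, C) video latent into flattened spacetime patches,
    one row per patch, analogous to text tokens for an LLM."""
    T, H, W, C = video.shape
    v = video.reshape(T // t_patch, t_patch,
                      H // h_patch, h_patch,
                      W // w_patch, w_patch, C)
    v = v.transpose(0, 2, 4, 1, 3, 5, 6)   # group the three patch axes together
    return v.reshape(-1, t_patch * h_patch * w_patch * C)

# A toy compressed latent: 16 frames of 64x64 with 4 channels.
video = np.zeros((16, 64, 64, 4))
patches = spacetime_patches(video)
print(patches.shape)
```

Because the patch grid adapts to whatever frame count and resolution the latent has, the same model can train on videos of diverse durations, aspect ratios, and resolutions.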
The OpenAI API offers access to a range of models with different capabilities and price points, as well as the ability to fine-tune custom models. These include GPT-4, Dall-E, and Whisper. It provides an interface for developers looking to create applications using OpenAI's models. To access these models, users need an OpenAI account and an API key. They can interact with the API through HTTP requests from any language, or via OpenAI's official Python bindings, Node.js library, or a community-maintained library.
Capabilities include those below:
- Text generation
- Function calling
- Embeddings
- Fine-tuning
- Image generation
- Vision
- Text-to-speech
- Speech-to-text
- Moderation
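Interacting with the API through HTTP requests, as described above, amounts to sending a JSON payload with an API key in the authorization header. A minimal sketch of building such a request for the chat completions endpoint; the model name and prompt are examples, and a real call requires a valid API key in place of the placeholder:

```python
import json

# The chat completions endpoint; requests are authorized with a bearer token.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(api_key, prompt, model="gpt-3.5-turbo"):
    """Build the headers and JSON body for a chat completion request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, json.dumps(body)

# "sk-..." is a placeholder; an actual key comes from the OpenAI account page.
headers, body = build_request("sk-...", "Say hello.")
print(json.loads(body)["model"])
```

The official Python and Node.js libraries wrap this same request shape, handling authentication and response parsing for the developer.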
OpenAI introduced the Assistants API, released in beta, at its first developer conference in November 2023. It allows users to build AI assistants (with instructions and the ability to leverage models, tools, and knowledge, responding to user queries) within their own applications. Use cases include natural language-based data analysis apps, coding assistants, AI-powered vacation planners, and more. The Assistants API is built on the same capabilities that enable OpenAI's GPTs product. Upon release, the Assistants API supports three types of tools:
- Code interpreter—writes and runs Python code in a sandboxed execution environment, with the ability to generate graphs and charts, and process files with diverse data and formatting. It allows assistants to run code iteratively to solve challenging code and math problems, and more.
- Retrieval—augments the assistant with knowledge from outside OpenAI's models, such as proprietary domain data, product information, or documents provided by users. The Assistants API can optimize what retrieval technique to use based on experience building knowledge retrieval in ChatGPT.
- Function calling—enables assistants to invoke developer-defined functions and incorporate the function response in their messages.
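Function calling works by describing each function to the model in a JSON-schema format; the model then emits a structured call that the application executes before returning the result. A minimal sketch, where `get_weather` and its parameters are hypothetical examples:

```python
# A function definition in the JSON-schema shape the API expects;
# get_weather and its "city" parameter are hypothetical.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

def dispatch(tool_call, registry):
    """Route a model-issued tool call to the matching local function."""
    fn = registry[tool_call["name"]]
    return fn(**tool_call["arguments"])

# The model never runs code itself: the application executes the call and
# sends the result back in a follow-up message.
result = dispatch(
    {"name": "get_weather", "arguments": {"city": "Paris"}},
    {"get_weather": lambda city: f"Sunny in {city}"},
)
print(result)
```

This split keeps execution on the developer's side, with the model responsible only for deciding when to call a function and with what arguments.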
Pricing for language model use is based on tokens, representing commonly occurring sequences of characters. One token is approximately 4 characters or 0.75 words for English text.
GPT-4 Turbo:
- gpt-4-1106-preview, input—$0.01 / 1K tokens, output—$0.03 / 1K tokens
- gpt-4-1106-vision-preview, input—$0.01 / 1K tokens, output—$0.03 / 1K tokens
GPT-4:
- gpt-4, input—$0.03 / 1K tokens, output—$0.06 / 1K tokens
- gpt-4-32k, input—$0.06 / 1K tokens, output—$0.12 / 1K tokens
GPT-3.5 Turbo:
- gpt-3.5-turbo-1106, input—$0.0010 / 1K tokens, output—$0.0020 / 1K tokens
- gpt-3.5-turbo-instruct, input—$0.0015 / 1K tokens, output—$0.0020 / 1K tokens
Fine-tuning models:
- gpt-3.5-turbo, training—$0.0080 / 1K tokens, input usage $0.0030 / 1K tokens, output usage $0.0060 / 1K tokens
- davinci-002, training—$0.0060 / 1K tokens, input usage $0.0120 / 1K tokens, output usage $0.0120 / 1K tokens
- babbage-002, training—$0.0004 / 1K tokens, input usage $0.0016 / 1K tokens, output usage $0.0016 / 1K tokens
Embedding models:
- ada v2, usage—$0.0001 / 1K tokens
Base models:
- davinci-002, usage—$0.0020 / 1K tokens
- babbage-002, usage—$0.0004 / 1K tokens
Assistants API (each assistant incurs its own retrieval file storage fee based on the files passed to it; tokens used by the Assistants API are billed at the chosen language model's per-token input/output rates):
- Code interpreter, $0.03 / session
- Retrieval, $0.20 / GB / assistant / day
Image models:
- Dall-E 3 Standard, $0.040 / image (1024×1024), $0.080 / image (1024×1792 and 1792×1024)
- Dall-E 3 HD, $0.080 / image (1024×1024), $0.120 / image (1024×1792 and 1792×1024)
- Dall-E 2 Standard, $0.020 / image (1024×1024), $0.018 / image (512×512), $0.016 / image (256×256)
Audio models:
- Whisper, $0.006 / minute (rounded to the nearest second)
- TTS, $0.015 / 1K characters
- TTS HD, $0.030 / 1K characters
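The per-token rates listed above translate into per-request costs with simple arithmetic. A sketch using the gpt-4 rates from the table; the token counts in the example are illustrative:

```python
# gpt-4 rates from the pricing list above, in dollars per 1K tokens.
GPT4_INPUT_PER_1K = 0.03
GPT4_OUTPUT_PER_1K = 0.06

def request_cost(input_tokens, output_tokens):
    """Dollar cost of one gpt-4 request, billing input and output separately."""
    return (input_tokens / 1000) * GPT4_INPUT_PER_1K \
         + (output_tokens / 1000) * GPT4_OUTPUT_PER_1K

# A 1,500-token prompt (roughly 1,100 words at ~0.75 words per token)
# with a 500-token reply.
print(f"${request_cost(1500, 500):.3f}")
```

Because input and output tokens are priced differently, long prompts and long completions contribute to the bill at different rates.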
OpenAI offers users the ability to build custom versions of ChatGPT, called GPTs. These custom versions of ChatGPT can be tailored for specific user or enterprise tasks. Creating a tailored GPT requires no coding. GPTs are available to paying ChatGPT Plus subscribers and OpenAI enterprise customers. Demos of the platform include a "creative writing coach" bot that can critique writing samples and a GPT to help attendees of a developer conference. The platform auto-named the bot “Event Navigator,” generated a profile picture for it using DALL-E, and ingested an event schedule to help attendees. Each GPT can be given access to web browsing, DALL-E, and OpenAI’s Code Interpreter tool.
Custom GPTs are available to ChatGPT Plus, Team, and Enterprise users through the GPT store. The store features GPTs developed by OpenAI partners and the community. Visitors can browse popular and trending GPTs on the community leaderboard with categories such as DALL·E, writing, research, programming, education, and lifestyle. The store highlights new and impactful GPTs each week. To share a GPT to the store, builders have to verify their profile and ensure it is compliant with OpenAI's usage policies and brand guidelines. OpenAI has implemented a review system that includes human and automated reviews and the ability for users to report GPTs. US GPT builders can earn money for their work based on user engagement.
OpenAI undertakes research to align its AI systems with its mission to benefit all of humanity. This includes training AI systems to do what humans want and to be helpful, truthful, and safe. A post from August 2022 detailed the company's empirical and iterative approach to aligning its AI systems with human values and human intent. OpenAI aims to push alignment ideas as far as possible and to understand how different approaches succeed or fail. Unaligned AGI poses a significant risk to humanity and finding solutions requires input from a large number of people. Aligning AI systems poses a wide range of socio-technical challenges. Therefore, OpenAI has committed to sharing its alignment research when safe to do so.
OpenAI's alignment research focuses on building a scalable training signal for smart AI systems that align with human intent. There are three main components to the alignment research:
- Training AI systems using human feedback
- Training AI systems to assist human evaluation
- Training AI systems to do alignment research
In July 2023, OpenAI started a new superalignment team, co-led by Ilya Sutskever (cofounder and Chief Scientist) and Jan Leike (Head of Alignment), to work on the scientific and technical breakthroughs required to align future superintelligent AI systems. The company aims to solve the problem within four years and is dedicating 20 percent of its secured compute power over this period to the effort.
Artificial superintelligence (ASI) refers to a theoretical form of AI that surpasses human intellect, with cognitive abilities and thinking skills of its own. ASI represents a much higher capability level than AGI. While this type of AI is not currently possible, and there is uncertainty over the speed of its development, its potential power could bring significant dangers, up to the disempowerment of humanity. OpenAI aims to achieve breakthroughs in aligning systems far more capable than current models.
OpenAI's goal is to build a roughly human-level automated alignment researcher, then use vast amounts of compute to scale efforts and iteratively align superintelligence. Aligning an automated alignment researcher requires OpenAI to:
- Develop a scalable training method
- Validate the resulting model
- Stress test the entire alignment pipeline
The company plans to leverage AI systems to provide a training signal on tasks that are difficult for humans to evaluate and to assist in the evaluation of other AI systems (scalable oversight). It also wants to understand and control how its models generalize oversight to tasks that are not supervised. To validate the alignment of systems, the company automates the search for problematic behavior and problematic internals. Finally, the entire pipeline will be tested by deliberately training misaligned models and confirming that the company's techniques detect the worst kinds of misalignment (adversarial testing).
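As a toy illustration of the adversarial-testing idea (a sketch under stated assumptions, not OpenAI's actual pipeline), one can deliberately plant a "misaligned" model that only misbehaves on unusual inputs and confirm that an automated search over probe inputs flags it:

```python
# Toy sketch of adversarial testing, not OpenAI's pipeline. The "spec"
# here is that a model must never return a negative value; the planted
# misaligned model violates it only on out-of-distribution inputs.

def aligned_model(x: int) -> int:
    return abs(x)  # always satisfies the spec

def misaligned_model(x: int) -> int:
    # Deliberately misbehaves, but only for large inputs, so spot
    # checks on typical inputs would miss it.
    return -abs(x) if x > 100 else abs(x)

def automated_check(model, probes):
    """Automated search for problematic behavior: return every probe
    input on which the model violates the spec."""
    return [x for x in probes if model(x) < 0]

probes = range(-1000, 1001)
clean = automated_check(aligned_model, probes)
violations = automated_check(misaligned_model, probes)
```

The point of planting the misalignment is to validate the check itself: if the automated search failed to flag the planted model, the testing pipeline, not the model, would be suspect.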
While this is the initial strategy, OpenAI expects the research priorities of the superalignment team to evolve as they start working on the problem and learn more. Solving the ultimate problem of aligning superintelligent systems will require providing evidence and arguments that convince the machine learning and safety communities.
OpenAI has been criticized for not sharing its research, the potential misuse of its AI systems, and releasing its generative AI tools without enough consideration of the ethical and legal problems they create.
The company's language model GPT-2 was originally deemed too dangerous to release, with fears that it could be used to produce large-scale disinformation. This decision received backlash from others in the field. Some said it showed the company walking away from earlier promises of openness and transparency, while others said it was a publicity stunt, with the company overestimating the capabilities of its model. OpenAI critics see the company as repeatedly fueling the AI hype cycle, even mischaracterizing its results.
Other critics believe most of OpenAI's breakthroughs come from scaling and assembling known techniques (first developed in other labs) and sinking significantly greater computational resources into them. This characterization has been denied by OpenAI cofounders Greg Brockman and Ilya Sutskever. However, teams at OpenAI have run experiments to test how far AI capabilities can be advanced by training existing algorithms with large amounts of data and compute power. The results of these experiments were hidden from the public for roughly six months, with employees explicitly instructed not to reveal them; those who left signed nondisclosure agreements.
OpenAI's image-generating models, Dall-E and Dall-E 2, have been criticized for threatening the livelihoods of artists. Dall-E 2 can replicate the style of specific artists even if they have not consented to their work being used in the model's training data. Getty Images banned the upload and sale of any images generated using generative AI tools, such as Dall-E 2.
The release of ChatGPT in late 2022 brought significant attention to OpenAI and the capabilities of its new language model. The model was criticized for producing harmful outputs, for its potential to displace human workers, and for enabling plagiarism, particularly in schools and universities. Amid significant discussion about the misuse of ChatGPT, Sam Altman tweeted in December 2022 about the model's current capabilities:
ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness. It’s a mistake to be relying on it for anything important right now. It’s a preview of progress; we have lots of work to do on robustness and truthfulness.
In January 2023, OpenAI released details of an AI classifier intended to distinguish between AI-written and human-written text. However, the company acknowledges it is impossible to detect all AI-written text. The initial release is considered a work in progress, with OpenAI hoping to gather feedback and share improved methods in the future. The classifier has significant limitations: it correctly identifies only 26% of AI-written text as “likely AI-written,” and it mislabels human-written text as AI-written 9% of the time. Upon its release, OpenAI stated:
Our classifier has a number of important limitations. It should not be used as a primary decision-making tool, but instead as a complement to other methods of determining the source of a piece of text.
The classifier is instead seen as a preliminary resource for educators.
OpenAI also manages a startup fund to invest in AI companies. Founded on May 26, 2021, with investment from Microsoft and other partners, the OpenAI Startup Fund is investing $100 million to help AI companies it considers to have "a profound, positive impact on the world." The fund is looking to partner with a small number of early-stage startups in fields such as health care, climate change, and education. Managed by a team with expertise in investing, including members of OpenAI leadership and technical staff, the fund will also offer the selected companies early access to future OpenAI systems, support from OpenAI's team, and credits on Microsoft Azure.
In November 2022, the OpenAI Startup Fund launched Converge, a five-week program for engineers, designers, researchers, and product builders using AI to reimagine products and industries. Chosen companies received a $1 million equity investment from the OpenAI Startup Fund, along with early access to OpenAI models and workshops and events with the OpenAI team. The first cohort consisted of roughly ten founding teams across all phases of the seed stage. Converge ran from December 5, 2022, until January 27, 2023, with two weeks of programming before the holidays and three weeks after.
On December 1, 2022, the OpenAI Startup Fund announced its first four investments:
- Descript—a video editor using AI to simplify the editing process
- Harvey—an interface for legal workflows using generative language models
- Mem—an AI-powered self-organizing workspace that can predict the information most relevant to users at a given time
- Speak—an AI tutor that can have open-ended conversations with people learning a new language and provide feedback. The company initially launched in East Asia and has over 100,000 paying subscribers.
OpenAI publishes its research as papers and presents it at events and conferences. The company's website lists its published research papers and the events at which OpenAI has presented or demonstrated its technology.
Key papers describing OpenAI models include the following:
- GPT—Improving Language Understanding by Generative Pre-Training
- GPT-2—Language Models are Unsupervised Multitask Learners
- GPT-3—Language Models are Few-Shot Learners
- GPT-4—GPT-4 Technical Report
- CLIP—Learning Transferable Visual Models From Natural Language Supervision
- Dall-E—Zero-Shot Text-to-Image Generation
- Dall-E 2—Hierarchical Text-Conditional Image Generation with CLIP Latents
- Whisper—Robust Speech Recognition via Large-Scale Weak Supervision