Cerebras is an AI company building computer systems designed to accelerate AI workflows. The company has a team of computer architects, computer scientists, deep learning researchers, business experts, and engineers of all types with the goal of building a new class of computers for AI work. Cerebras states its wafer-scale clusters can train AI models in record time, and the company is developing an HPC (high-performance computing) accelerator, called the CS-2, with 850,000 cores and 40 GB of on-chip memory.
Cerebras's technology has applications in multiple AI fields, such as NLP (natural language processing), CV (computer vision), and HPC, and for a range of industries, including health & pharma, energy, government, scientific computing, financial services, and web & social media. The Cerebras platform has trained a number of models, from multilingual LLMs to healthcare chatbots. The company helps customers train their own foundation models or fine-tune open-source models like Llama 2. Cerebras open-sourced seven GPT models in March 2023, the first trained using its CS-2 hardware.
Cerebras was founded in 2016 by Andrew Feldman (CEO), Jean-Philippe Fricker (Chief System Architect), Michael James (Chief Architect, Advanced Technologies), Gary Lauterbach (CTO), and Sean Lie (Chief Hardware Architect). Feldman previously cofounded and was CEO of SeaMicro, where he worked with the other four Cerebras cofounders. SeaMicro developed energy-efficient, high-bandwidth microservers and was acquired by AMD in 2012 for $357 million. The company is headquartered in Sunnyvale, California, with locations in San Diego, Toronto, and Bangalore.
Cerebras has raised over $700 million in funding from Alpha Wave, Benchmark, Foundation Capital, Eclipse, Coatue, VY Capital, Altimeter, and angel investors Fred Weber, Ilya Sutskever, Sam Altman, Andy Bechtolsheim, Greg Brockman, Adam D'Angelo, Mark Leslie, Nick McKeown, David "Dadi" Perlmutter, Syed Atiq Raza, Jeff Rothschild, Pradeep Sindhu, and Lip-Bu Tan. This total includes $250 million in Series F funding from November 2021, which valued the company at around $4 billion.
First announced in August 2019, the Cerebras Wafer-Scale Engine (WSE) is built from a single silicon wafer supplied by TSMC; it measures 46,225 mm² and contains 1.2 trillion transistors, 400,000 cores, and 18 gigabytes of on-chip memory (SRAM). The chip's size is meant to work around a limitation of systems built from multiple graphics processing units (GPUs), which lose time to communication bottlenecks between chips; by keeping all 400,000 cores on one wafer, the WSE is intended to communicate quickly and reduce AI training time. Speaking to Fortune about the WSE, CEO and cofounder Andrew Feldman said:
Every time there has been a shift in the computing workload, the underlying machine has had to change.
Before Cerebras's chip, manufacturers were reluctant to build chips this large. Manufactured silicon often contains defects; affected chips are binned as lower-grade parts or discarded. Cerebras works around defects by building redundant circuits into the design. The company uses TSMC's 16 nm node and a 300 mm wafer, out of which it cuts the largest possible square. From that, it makes its 400,000 sparse linear algebra (SLA) cores, which are designed for AI deep learning workloads. The 18 GB of SRAM delivers an aggregate 9 petabytes per second of memory bandwidth.
Cerebras, with TSMC, developed a technique for laying thousands of links across the scribe lines between dies. The result is a chip that behaves not as 84 separate processing tiles but as a single processor with 400,000 cores. The built-in redundancy of this technique means Cerebras can approach 100% yield in its manufacturing process.
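The effect of redundancy on yield can be illustrated with a standard Poisson defect model. The sketch below is not Cerebras's actual model; the defect density and spare-core fraction are assumed, illustrative numbers, while the die area and core count come from the figures above.

```python
import math

# Illustrative numbers only; Cerebras does not publish its defect model.
AREA_MM2 = 46_225        # WSE die area, from the article
N_CORES = 400_000        # WSE core count, from the article
DEFECTS_PER_MM2 = 0.005  # assumed defect density (~0.5 defects per cm^2)

def monolithic_yield(area_mm2, d0):
    """Classic Poisson yield model: any single defect scraps the whole die."""
    return math.exp(-area_mm2 * d0)

def redundant_yield(n_spares, area_mm2, d0):
    """With spare cores, the die survives as long as the number of defective
    cores does not exceed the spares. The defect count is modeled as
    Poisson(lam), where lam is the expected number of defects on the die."""
    lam = area_mm2 * d0
    term, total = math.exp(-lam), 0.0  # term = Poisson pmf at k = 0
    for k in range(n_spares + 1):      # accumulate P(defects <= n_spares)
        total += term
        term *= lam / (k + 1)          # pmf recurrence: p(k+1) = p(k)*lam/(k+1)
    return total

print(f"no redundancy : {monolithic_yield(AREA_MM2, DEFECTS_PER_MM2):.2e}")
print(f"1% spare cores: {redundant_yield(N_CORES // 100, AREA_MM2, DEFECTS_PER_MM2):.4f}")
```

Under these assumptions a monolithic wafer-scale die would essentially never yield, while a 1% pool of spare cores absorbs the expected ~230 defects with probability close to 1, which is the intuition behind the near-100% yield claim.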
There were also issues of thermal expansion, connectivity, and cooling on a chip this large. Cerebras developed a new connector to deliver power through the chip's PCB rather than across it. To cool the chip, and to prevent the thermal throttling or thermal expansion that could crack it, the company developed a water-cooling solution that pumps water onto a copper cold plate with a micro-fin array to pull heat off the chip. The heated water is then air-cooled in a radiator.
Unveiled in April 2021, the second-generation WSE (WSE-2) powers Cerebras's CS-2 system. The company states it is the largest chip ever built, with 2.6 trillion transistors, 850,000 cores, and 40 gigabytes of on-wafer memory.
The Cerebras CS-2 is a purpose-built deep learning system for fast and scalable AI workloads. Cerebras states that the CS-2 significantly reduces model training time and inference latency. The system is powered by the Wafer-Scale Engine 2 (WSE-2), with 850,000 compute cores, 40 GB of on-chip SRAM, 20 PB/s of memory bandwidth, and 220 Pb/s of interconnect bandwidth, housed in purpose-built packaging with cooling and power delivery.
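A quick back-of-envelope calculation puts these aggregate figures in per-core terms. The core count and bandwidth numbers are the vendor figures quoted above; treating PB/s as 10^15 bytes per second and Pb/s as 10^15 bits per second is an assumption about the units.

```python
# Per-core figures derived from the stated WSE-2 aggregates.
WSE2_CORES = 850_000
SRAM_BYTES = 40 * 1e9       # 40 GB on-chip SRAM
MEM_BW = 20 * 1e15          # 20 PB/s aggregate memory bandwidth, bytes/s
FABRIC_BW = 220 * 1e15 / 8  # 220 Pb/s interconnect, converted to bytes/s

print(f"SRAM per core      : {SRAM_BYTES / WSE2_CORES / 1e3:.0f} KB")
print(f"memory BW per core : {MEM_BW / WSE2_CORES / 1e9:.1f} GB/s")
print(f"fabric BW per core : {FABRIC_BW / WSE2_CORES / 1e9:.1f} GB/s")
```

Under those assumptions each core has roughly 47 KB of local SRAM with on the order of 20-30 GB/s of memory and fabric bandwidth, which is the architectural argument for keeping all cores on one wafer rather than spreading them across separate chips.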
In July 2023, Cerebras introduced the Condor Galaxy 1 (CG-1), a 4 exaFLOPS supercomputer for generative AI. Built in partnership with G42, the CG-1 is the first in a planned series of nine supercomputers. Cerebras states that all nine will be completed in 2024 and that, interconnected, they will provide 36 exaFLOPS of AI compute. CG-1 is located at the Colovore data center in Santa Clara, California.
The Cerebras AI Model Studio is a pay-by-the-model computing service powered by clusters of CS-2s and hosted by Cirrascale Cloud Services. It is a purpose-built platform for training and fine-tuning large language models on dedicated clusters of millions of cores.