Data-labeling startup Scale AI has raised $1 billion, leading to a doubling of its valuation to $13.8 billion.

Scale AI, a company that offers data-labeling services for businesses training machine learning models, has raised $1 billion in a Series F funding round from institutional and corporate investors, including Amazon and Meta.

This funding round is a combination of primary and secondary financing and is part of a broader trend of significant venture capital investments in the AI sector. Recently, Amazon finalized a $4 billion investment in Anthropic, a competitor to OpenAI, while other companies like Mistral AI and Perplexity are also in the midst of raising billion-dollar rounds at high valuations.

Prior to this latest round, Scale AI had raised approximately $600 million over its eight-year history, including a $325 million Series E in 2021 that valued the company at around $7 billion, roughly double its valuation at the Series D round in 2020. Now, despite challenges that led to a 20% workforce reduction last year, Scale AI’s valuation has risen to $13.8 billion, reflecting intense competition among investors to capitalize on the growing AI market.

The Series F funding round was led by Accel, which also spearheaded the company’s Series A and participated in subsequent funding rounds. 

In addition to Amazon and Meta, Scale AI has attracted a diverse group of new investors, including the venture arms of Cisco, Intel, AMD, and ServiceNow, as well as DFJ Growth, WCM, and investor Elad Gil. Many of its existing investors also participated, including Nvidia, Coatue, Y Combinator, Index Ventures, Founders Fund, Tiger Global Management, Thrive Capital, Spark Capital, Greenoaks, Wellington Management, and former GitHub CEO Nat Friedman.

**Banking on the Growing Importance of Data**

Data is the foundation of artificial intelligence, and companies that specialize in data management and processing are drawing growing investor interest. Recently, Weka announced it raised $140 million at a post-money valuation of $1.6 billion to help companies build data pipelines for their AI applications.

Founded in 2016, Scale AI combines machine learning with "human-in-the-loop" oversight to manage and annotate substantial volumes of data, which is crucial for training AI systems in various sectors, including autonomous vehicles.

Most data is unstructured, which makes it difficult for AI systems to use effectively without preprocessing. The data must first be labeled, a resource-intensive task, particularly with large datasets. Scale AI provides companies with accurately annotated data that is ready for model training. The company tailors its services to each industry: a self-driving car company requires labeled data from cameras and lidar, while natural language processing (NLP) applications need annotated text.

Scale AI's clientele includes major players such as Microsoft, Toyota, GM, Meta, the U.S. Department of Defense, and, as of last August, OpenAI, the creator of ChatGPT, which is leveraging Scale AI to allow companies to fine-tune its GPT-3.5 text-generating models.

The company plans to utilize the newly raised funds to accelerate the availability of "frontier data," which it views as essential for achieving artificial general intelligence. 

“Data abundance is not the default — it’s a choice,” stated Scale AI’s CEO and co-founder, Alexandr Wang, in a press release. “It requires bringing together the best minds in engineering, operations, and AI. Our vision is one of data abundance, where we have the means of production to continue scaling frontier large language models (LLMs) by many more orders of magnitude. We should not be data-constrained in reaching GPT-10.”

2024-09-30 19:48:49