In a Reddit AMA, OpenAI CEO Sam Altman admitted that one of the major factors preventing the company from shipping products as often as it'd like is a lack of compute capacity.
"All of these models have gotten quite complex," he said in reply to a question on why OpenAI's next AI models were taking so long. "We also face a lot of limitations and hard decisions about [how] we allocated our compute towards many great ideas."
Many reports indicate that OpenAI has been struggling to obtain enough compute infrastructure to run and train its generative models. Just this week, Reuters, citing sources, said that OpenAI has for months been working with Broadcom to create an AI chip for running models, which could arrive as soon as 2026.
Partly for capacity reasons, Altman said, Advanced Voice Mode, ChatGPT's realistic-sounding conversational feature, isn't likely to get vision capabilities soon. In April, at an OpenAI press event, the firm showed the app running on a smartphone and responding to visual cues, such as the clothes someone was wearing, within view of the phone's camera.
Fortune later reported that the demo was rushed out to pull attention back to OpenAI from Google's I/O developer conference, which took place the same week. Most insiders at OpenAI doubted that GPT-4o was ready to be revealed. Tellingly, even the voice-only version of Advanced Voice Mode was delayed by a month.
In the AMA, Altman said that the next major version of OpenAI's image generator, DALL-E, has no launch timeline ("we don't have a release plan yet"). Kevin Weil, OpenAI's chief product officer, who also took part in the AMA, said that Sora, OpenAI's video-generating tool, has been held back by the need to perfect the model, get safety, impersonation, and other things right, and scale compute.
According to reports, Sora has had technical issues that left it at a disadvantage against rival systems from Luma, Runway, and others. The Information reported that the original system, unveiled in February this year, took more than 10 minutes to process a one-minute video clip.
In October this year, Tim Brooks, one of Sora's co-leads, left for Google.
Later in the AMA, Altman said that OpenAI is still considering allowing "NSFW" content in ChatGPT "someday" ("we totally believe in treating adult users like adults," he wrote), and that the company's top priority is improving its o1 series of "reasoning" models and their successors. OpenAI previewed a number of features coming to o1 at its DevDay conference in London this week, including image understanding.
OpenAI has some pretty strong launches coming later in the year, Altman wrote. "Nothing that we are going to call GPT-5, though."