There is no AI without data; no AI without unstructured data; and no AI without unstructured data at scale, said Chet Kapoor, chairman and CEO of data management company DataStax.
Kapoor was moderating a conversation at TechCrunch Disrupt 2024 on "new data pipelines" in the context of modern AI applications, featuring Vanessa Larco, a partner at VC firm NEA, and George Fraser, CEO of data integration platform Fivetran. The chat touched on topics ranging from data quality to the role of real-time data in generative AI, but one theme stood out: AI is still early in its development, so companies should put product-market fit first, not scale. The advice for companies looking to jump into the dizzying world of generative AI is straightforward: don't try to do too much too soon, and focus on practical, incremental progress. Why? We're really still figuring it all out.
"The most important thing for generative AI is that it all comes down to the people," Kapoor said. "The SWAT teams that actually go off and build the first few projects — they are not reading a manual; they are writing the manual for how to do generative AI apps."
While it is a given that data and AI go hand in hand, getting a company's data under control, with some of it sensitive and tightly protected and possibly stored in multiple locations, can be overwhelming. Larco, who works with and sits on the boards of numerous startups across the B2C and B2B spectrum, recommended a simple but pragmatic approach to unlocking true value in these early days.
"Work backwards for what you're trying to accomplish — what are you trying to solve for, and what is the data that you need?" Larco said. "Find that data, wherever it resides, and then use it for this purpose."
This is in contrast to splashing generative AI across the whole company from day one, throwing all the data at the LLM and hoping the right thing comes out at the end. That approach is going to create an incorrect, expensive mess, Larco said. "Start small," she said. "Just what we're seeing," Fraser said, "companies starting small, with an internal application, with pretty specific goals, and finding the data that matches exactly what they're trying to accomplish."
Fraser founded the "data movement" platform Fivetran 12 years ago, winning big-name customers like OpenAI and Salesforce along the way. Companies must focus narrowly on the real problems they are facing right now, he said.
"Only solve the problems you have today; that is the mantra," Fraser said. "The costs of innovation are always 99% in things you build that did not work out, not in things that worked out that you wished you had planned for scale ahead of time. While those are the problems that we always think about in hindsight, those aren't the 99% of the cost you bear."
Much like the early days of the web and, more recently, the smartphone revolution, early applications and use cases for generative AI have shown glimpses of a powerful new AI-enabled future. But so far, they haven't necessarily been game-changing.
"I call this the Angry Birds era of generative AI," Kapoor said. "It's not completely changing my life, no one's doing my laundry yet. This year, each of the enterprises I am working with is putting something into production: small, internal, but putting it into production because they are actually working out the kinks on how to form the teams to go and make this happen. Next year is what I call the year of transformation, when people start doing apps that actually change the trajectory of the company that they work for."