OpenAI is funding academic research into algorithms that can predict humans' moral judgments.
In a tax filing with the IRS, OpenAI Inc., OpenAI's nonprofit organization, reported that it gave a grant to Duke University researchers for a project titled "Research AI Morality." An OpenAI spokesperson pointed to a press release stating that the award is part of a larger three-year, $1 million grant to Duke professors studying "making moral AI."
Little is publicly known about this "morality" research OpenAI is financing besides the fact that the grant expires in 2025. The study's principal investigator, Walter Sinnott-Armstrong, professor of practical ethics at Duke, told TechCrunch by email that he "will not be able to talk" about the work.
Sinnott-Armstrong and the project's co-investigator, Jana Borg, have published several studies, and a book, on AI's potential to serve as a "moral GPS" that helps humans make better judgments. As part of larger teams, they've created a "morally aligned" algorithm to help decide who receives kidney donations, and studied in which circumstances people would prefer that AI make moral decisions.
"A New Framework for Common Sense: Learning to Predict Human Moral Judgments," the project funded by OpenAI, aims to train algorithms to predict human moral judgments about how to behave in situations where conflicts arise "among morally relevant features in medicine, law, and business."
But it's far from clear that such a nuanced concept as morality is within the reach of today's tech.
In 2021, the nonprofit Allen Institute for AI built a tool called Ask Delphi, which was meant to give ethically and morally sound advice. It handled basic moral dilemmas well enough: the bot "knew" that cheating on an exam was wrong, for instance. But slightly rephrasing and rewording a question was enough to get Delphi to condone just about anything, including smothering infants.
The explanation lies in how contemporary AI systems work.
Machine learning models are statistical machines. Trained on a lot of examples sourced from all over the web, they learn the patterns in that data and use them to make predictions, such as that the phrase "to whom" often precedes "it may concern."
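To make that concrete, here's a minimal, purely illustrative sketch (not anything OpenAI or the Duke researchers have described) of statistical pattern-matching at its crudest: a toy model that counts which word most often follows another in a tiny made-up corpus and "predicts" the continuation on that basis alone.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for web-scale training data (illustrative only).
corpus = [
    "to whom it may concern",
    "to whom it may concern",
    "to whom do I owe the pleasure",
]

# Count which word most often follows each word in the corpus.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1

def predict_next(word):
    """Return the continuation seen most frequently in the training data."""
    if word not in next_word_counts:
        return None
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next("whom"))  # -> "it", purely because that pattern dominates the data
```

The model has no idea what the words mean; it simply echoes whatever pattern is most common in what it was fed, which is the same reason a large model's "moral" answers mirror whatever views dominate its training data.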
AI doesn’t have an appreciation for ethical concepts, nor a grasp on the reasoning and emotion that play into moral decision-making. That’s why AI tends to parrot the values of Western, educated, and industrialized nations — the web, and thus AI’s training data, is dominated by articles endorsing those viewpoints.
Not surprisingly, many people's values aren't reflected in the answers an AI gives, particularly if those people aren't contributing to the AI's training data by posting online. And AI internalizes a range of biases beyond a Western slant: Delphi said that being straight is more "morally acceptable" than being gay.
The inherent subjectivity of morality makes the challenge facing OpenAI, and the researchers it's backing, all the more intractable: philosophers have been debating the merits of various ethical theories for thousands of years, and there's no universally applicable framework in sight.
Claude favors Kantianism (absolute moral rules), while ChatGPT leans ever so slightly utilitarian (the greatest good for the greatest number of people). Which is better? It depends on whom you ask.
An algorithm seeking to predict how humans will judge some action in terms of right and wrong has to account for all this. That's a pretty high bar to clear, assuming such an algorithm is possible in the first place.