NIST has released a tool designed to assess the risks of AI models.

The National Institute of Standards and Technology has re-released a test bed designed to measure how malicious attacks, particularly ones that "poison" AI model training data, could degrade an AI system's performance.

Dubbed Dioptra (after the classical astronomical and surveying instrument), the modular, open source, web-based tool first debuted in 2022 and is intended to help companies that train AI models, as well as the people who use those models, assess, analyze, and monitor AI risk. NIST said Dioptra can be used to benchmark and research models, and to provide a common platform for exposing models to simulated threats in a "red-teaming" environment.

"Testing the effects of adversarial attacks on machine learning models is one goal of Dioptra," NIST said in a news release. "The open source software, which can be downloaded for free, could help the community, including government agencies and small to medium-sized businesses, perform evaluations to test AI developers' claims about their systems' performance."
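To make the idea of a poisoning attack concrete, here is a minimal, hypothetical sketch (it does not use Dioptra or its API): it flips the labels of a fraction of a toy training set, retrains the same classifier, and measures how test accuracy degrades. Quantifying this kind of degradation is the sort of effect a test bed like Dioptra is meant to capture.

```python
# Illustrative label-flipping "poisoning" experiment (not Dioptra's API):
# train the same classifier on clean and partially poisoned data and
# compare accuracy on a held-out test set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def accuracy_with_poisoning(flip_fraction: float) -> float:
    """Flip the labels of a random fraction of training samples, then train."""
    rng = np.random.default_rng(0)
    y_poisoned = y_train.copy()
    n_flip = int(flip_fraction * len(y_poisoned))
    idx = rng.choice(len(y_poisoned), size=n_flip, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # binary labels: flip 0 <-> 1
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    return accuracy_score(y_test, model.predict(X_test))

for frac in (0.0, 0.1, 0.3):
    print(f"poisoned fraction {frac:.0%}: test accuracy {accuracy_with_poisoning(frac):.3f}")
```

Real poisoning attacks are usually subtler than random label flipping (for example, targeted triggers or backdoors), but the evaluation pattern is the same: compare a model's behavior before and after the training data is tampered with.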

NIST and NIST's new AI Safety Institute published documents alongside Dioptra outlining ways to mitigate some of the dangers of AI, including its potential misuse to create nonconsensual pornography. The release follows the U.K. AI Safety Institute's Inspect, a toolset with a similar aim of assessing model capabilities and overall model safety. The United States and the United Kingdom have an ongoing partnership to jointly develop advanced AI model testing, announced at the U.K.'s AI Safety Summit at Bletchley Park last November. Dioptra is also a product of President Joe Biden's executive order (EO) on AI, which mandates, among other things, that NIST help test AI systems. The EO also establishes standards for AI safety and security, including requirements that companies developing models (Apple, for instance) notify the federal government and share the results of all safety tests before their models are deployed to the public.

We've written before about why AI benchmarks are hard, not least because today's most sophisticated AI models are black boxes whose infrastructure, training data, and other key details are kept tightly under wraps by the companies that deploy them.

A report published this month by the Ada Lovelace Institute, a U.K.-based nonprofit research institute that studies AI, found that evaluations alone are not sufficient to determine whether an AI model is safe in real-world conditions, in part because current policies let AI vendors selectively choose which evaluations to conduct. NIST does not claim that Dioptra can completely de-risk models, but the agency does suggest that Dioptra can shed light on which kinds of attacks might make an AI system less effective and quantify that impact on its performance.

In a major limitation, though, Dioptra only works out of the box on models that can be downloaded and used locally, like Meta's expanding Llama family. Models gated behind an API, like OpenAI's GPT-4o, are a no-go—at least for now.
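To illustrate why local access matters for this kind of testing, here is a minimal sketch, assuming the Hugging Face transformers library; the checkpoint name is a placeholder (gated models require accepting a license, and a real run needs enough memory). With local weights, a test harness can inspect internals such as next-token logits, which an API that returns only generated text does not expose.

```python
# Minimal sketch: a locally downloaded model exposes internals (logits,
# attention, weights) that a test harness can probe directly; API-gated
# models typically return only text.
# Assumes the Hugging Face `transformers` library; the checkpoint name
# below is a placeholder and may require license acceptance to download.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "meta-llama/Meta-Llama-3-8B"  # placeholder local model ID
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)            # full forward pass, logits included
next_token_logits = outputs.logits[0, -1]
top = torch.topk(next_token_logits, k=5)
print(tokenizer.convert_ids_to_tokens(top.indices.tolist()))
```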

 
