AI startup Mistral has released an API for content moderation.
The API, the same one that powers moderation in Mistral's Le Chat chatbot platform, can be customized for different applications and safety standards, Mistral says. It is powered by a model fine-tuned to classify text in English, French, and German into one of nine categories: sexual, hate and discrimination, violence and threats, dangerous and criminal content, self-harm, health, financial, law, and personally identifiable information.
According to Mistral, the moderation API can be applied to raw or conversational text.
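As a rough illustration of how a classifier like this could be tailored to different safety standards, the sketch below applies per-category thresholds to a set of category scores. It does not call Mistral's API. The snake_case category identifiers, the score dictionary, the 0.5 default threshold, and the `flagged_categories` helper are all assumptions made for the example, not details confirmed by Mistral.

```python
from typing import Dict, List

# The nine categories named in the article. The snake_case identifiers
# are an assumption for this sketch, not confirmed API field names.
CATEGORIES: List[str] = [
    "sexual",
    "hate_and_discrimination",
    "violence_and_threats",
    "dangerous_and_criminal_content",
    "self_harm",
    "health",
    "financial",
    "law",
    "pii",
]


def flagged_categories(
    scores: Dict[str, float],
    thresholds: Dict[str, float],
    default: float = 0.5,
) -> List[str]:
    """Return categories whose score meets or exceeds its threshold.

    Categories missing from `scores` are treated as 0.0; categories
    missing from `thresholds` fall back to `default`.
    """
    return [
        category
        for category in CATEGORIES
        if scores.get(category, 0.0) >= thresholds.get(category, default)
    ]


# Hypothetical scores for a single piece of text (made up for illustration).
example_scores = {"pii": 0.91, "health": 0.40, "violence_and_threats": 0.05}

# A stricter policy for health content lowers that category's threshold.
strict_health = {"health": 0.3}

print(flagged_categories(example_scores, {}))             # ['pii']
print(flagged_categories(example_scores, strict_health))  # ['health', 'pii']
```

Lowering a threshold makes the policy stricter for that category only, which is one simple way an application could tune the same classifier output to its own standards.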
Mistral said in a blog post that over the last several months it has seen growing interest, across industry and the research community, in new AI-based moderation systems that promise to make moderation more scalable and robust across applications. "Our content moderation classifier uses the most relevant policy categories for effective guardrails and introduces a pragmatic approach to model safety by addressing model-generated harms like unqualified advice and PII."
In theory, the approach is sound. But the moderation model is subject to the same biases and technical flaws as other AI systems.
For example, some models trained to detect toxicity rate phrases in African American Vernacular English, the informal grammar used by some Black Americans, as disproportionately "toxic." Other research has found that social media posts about people with disabilities are frequently flagged as more negative or toxic by commonly used public sentiment and toxicity detection models.
Mistral says that its moderation model is highly accurate, yet concedes it is a work in progress. The company did not compare its API's performance to other popular moderation APIs, such as Jigsaw's Perspective API or OpenAI's moderation API.
The company said it is working with its customers to build and share scalable, lightweight, and customizable moderation tooling, and that it will continue to engage with the research community to contribute safety advancements to the broader field.
Mistral also introduced a batch API. The company says it can reduce the cost of models served through its API by 25% by processing high-volume requests asynchronously. Anthropic, OpenAI, Google, and others also offer batching options for their AI APIs.
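To make the quoted saving concrete, here is a minimal sketch of the arithmetic. The 25% figure comes from Mistral's claim above; the `batch_cost` helper and the example amount are hypothetical and do not reflect actual pricing.

```python
# The 25% reduction Mistral cites for batch processing.
BATCH_DISCOUNT = 0.25


def batch_cost(standard_cost: float, discount: float = BATCH_DISCOUNT) -> float:
    """Return the cost after applying a fractional batch discount."""
    return standard_cost * (1.0 - discount)


# A workload that would cost 100.0 (in any currency unit) at standard
# rates would cost 75.0 if the full 25% reduction applied.
print(batch_cost(100.0))  # 75.0
```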