According to the company’s internal benchmarking, a recently released Google AI model scores worse on certain safety tests than its predecessor.
In a technical report published this week, Google revealed that its Gemini 2.5 Flash model is more likely to generate text that violates its safety guidelines than Gemini 2.0 Flash. On two metrics, “text-to-text safety” and “image-to-text safety,” Gemini 2.5 Flash regresses 4.1% and 9.6%, respectively.
Text-to-text safety measures how frequently a model violates Google’s guidelines given a prompt, while image-to-text safety evaluates how closely the model adheres to those guidelines when prompted with an image. Both tests are automated, not human-supervised.
In an emailed statement, a Google spokesperson confirmed that Gemini 2.5 Flash “performs worse on text-to-text and image-to-text safety.”
These surprising benchmark results come as AI companies move to make their models more permissive; in other words, less likely to refuse to respond to controversial or sensitive subjects. For its latest crop of Llama models, Meta said it tuned the models not to endorse “some views over others” and to reply to more “debated” political prompts. OpenAI said earlier this year that it would adjust future models to avoid taking an editorial stance and to offer multiple perspectives on controversial topics.
Sometimes, those permissiveness efforts have backfired. TechCrunch reported Monday that the default model powering OpenAI’s ChatGPT allowed minors to generate erotic conversations. OpenAI blamed the behavior on a “bug.”
According to Google’s technical report, Gemini 2.5 Flash, which is still in preview, follows instructions more faithfully than Gemini 2.0 Flash, including instructions that cross problematic lines. The company claims the regressions can be attributed in part to false positives, but it also admits that Gemini 2.5 Flash sometimes generates “violative content” when explicitly asked.
“Naturally, there is tension between [instruction following] on sensitive topics and safety policy violations, which is reflected in our evaluations,” the report reads.
Scores from SpeechMap, a benchmark that probes how models respond to sensitive and controversial prompts, also suggest that Gemini 2.5 Flash is far less likely to refuse to answer contentious questions than Gemini 2.0 Flash. TechCrunch’s testing of the model via the AI platform OpenRouter found that it will write essays in support of replacing human judges with AI, weakening due process protections in the U.S., and implementing widespread warrantless government surveillance programs.
Thomas Woodside, co-founder of the Secure AI Project, said the limited details Google gave in its technical report demonstrate the need for more transparency in model testing.
“There’s a trade-off between instruction-following and policy-following, because some users may ask for content that would violate policies,” Woodside told TechCrunch. “In this case, Google’s latest Flash model complies with instructions more while also violating policies more. Google doesn’t provide much detail on the specific cases where policies were violated, although it says they are not severe. Without knowing more, it’s hard for independent analysts to know whether there’s a problem.”
Google has come under fire before for its model safety reporting practices.
It took the company weeks to publish a technical report for Gemini 2.5 Pro, its most capable model. When that report was eventually published, it initially omitted key safety testing details.
On Monday, Google released a more detailed report with additional safety information.