
MSN Technology

Tech Solutions for a Smarter World


Researchers concerned to find AI models hiding their true “reasoning” processes

Posted on April 11, 2025


Remember when teachers demanded that you “show your work” in school? Some fancy new AI models promise to do exactly that, but new research suggests that they sometimes hide their actual methods and fabricate elaborate explanations instead.

New research from Anthropic, creator of the ChatGPT-like Claude AI assistant, examines simulated reasoning (SR) models such as DeepSeek's R1 and its own Claude series. In a research paper posted last week, Anthropic’s Alignment Science team demonstrated that these SR models often fail to disclose when they have used external help or taken shortcuts, despite features designed to show their “reasoning” process.

(It is worth noting that OpenAI's o1 and o3 series SR models deliberately obscure the accuracy of their “thought” process, so this study does not apply to them.)

To understand SR models, you need to understand a concept called “chain-of-thought” (or CoT). CoT works as a running commentary on an AI model’s simulated thinking process as it solves a problem. When you ask one of these models a complex question, the CoT process displays each step the model takes on its way to a conclusion, much as a person might reason through a puzzle by talking through each consideration, piece by piece.

Having an AI model generate these steps has reportedly proven valuable not only for producing more accurate results but also for “AI safety” researchers who monitor the systems’ internal operations. Ideally, this readout of “thoughts” should be both legible (understandable to humans) and faithful (accurately reflecting the model's actual reasoning).

“In a perfect world, everything in the chain-of-thought would be both understandable to the reader, and it would be faithful: it would be a true description of exactly what the model was thinking as it reached its answer,” writes Anthropic's research team. However, their experiments focusing on faithfulness suggest that we are far from this ideal scenario.

Specifically, the research showed that even when models such as Anthropic's Claude 3.7 Sonnet generated an answer using experimentally provided information, such as hints about the correct choice (whether accurate or deliberately misleading) or instructions suggesting an “unauthorized” shortcut, their displayed reasoning often failed to mention these external factors.


©2025 MSN Technology | Design: Newspaperly WordPress Theme