timesmoguls.com
Technology

Anthropic can now follow the bizarre internal functioning of a large model

March 29, 2025 | 3 Mins Read

Strange behavior

So: What did they find? Anthropic examined 10 different behaviors in Claude. One involved the use of different languages. Does Claude have a part that speaks French and another part that speaks Chinese, and so on?

The team found that Claude used components independent of any language to answer a question or solve a problem, and only then picked a specific language for its reply. Ask it “What is the opposite of small?” in English, French, and Chinese, and Claude will first use language-neutral components related to “smallness” and “opposites” to come up with an answer. Only then does it choose a specific language in which to respond. This suggests that large language models can learn things in one language and apply them in other languages.

Anthropic also examined how Claude solved simple math problems. The team found that the model seems to have developed its own internal strategies, which do not resemble those it will have seen in its training data. Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95.
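
To make the two-track idea concrete, here is a minimal toy sketch in Python. It is not how Claude computes anything internally. The function names are hypothetical, and the rough estimate of 92 is simply taken from the description above rather than derived. The sketch only shows how a fuzzy magnitude and an exact last digit can be combined into one answer.

```python
def last_digit_path(a: int, b: int) -> int:
    """Exact track: look only at the final digits (6 and 9 give 5)."""
    return (a % 10 + b % 10) % 10


def combine(rough_estimate: int, last_digit: int) -> int:
    """Pick the integer nearest the rough estimate that ends in last_digit.

    Assumes non-negative integers. On an exact tie the lower candidate wins,
    which is an arbitrary choice made for this illustration.
    """
    base = rough_estimate - (rough_estimate % 10) + last_digit
    candidates = [base - 10, base, base + 10]
    return min(candidates, key=lambda c: abs(c - rough_estimate))


# "92ish" is the approximate value reported in the article (an input here).
digit = last_digit_path(36, 59)
answer = combine(92, digit)
print(digit, answer)
```

With the article's numbers, the exact track yields a final digit of 5, and the nearest number to 92 ending in 5 is 95.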

And yet, if you then ask Claude how it worked that out, it will say something like: “I added the ones (6 + 9 = 15), carried the 1, then added the 10s (3 + 5 + 1 = 9), resulting in 95.” In other words, it gives you a common approach found everywhere online rather than what it actually did. Yep! LLMs are weird. (And not to be trusted.)
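
For comparison, the schoolbook procedure that Claude claims to follow can be written out as a short Python sketch. The function name is hypothetical, and the code illustrates only the explanation Claude gives, not its internal process.

```python
def add_with_carry(a: int, b: int) -> int:
    """Schoolbook addition of two non-negative integers, ones column first."""
    ones = a % 10 + b % 10            # 6 + 9 = 15
    carry = ones // 10                # carry the 1
    tens = a // 10 + b // 10 + carry  # 3 + 5 + 1 = 9
    return tens * 10 + ones % 10      # 9 tens and 5 ones


print(add_with_carry(36, 59))
```

Both routes arrive at 95, which is why the mismatch between the explanation and the actual computation is invisible from the output alone.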

The steps that Claude 3.5 Haiku used to solve a simple math problem were not what Anthropic expected, and they are not the steps that Claude claimed it took either. (Image: Anthropic)

This is clear evidence that large language models will give reasons for what they do that do not necessarily reflect what they actually did. But that is true of people too, says Batson: “You ask someone, ‘Why did you do this?’ And they say, ‘Hmm, I suppose it is because I was ...’ You know, maybe not.”

Biran thinks this finding is especially interesting. Many researchers study the behavior of large language models by asking them to explain their actions. But that could be a risky approach, he says: “As models continue to get stronger, they must be equipped with better guardrails. I believe, and this work also shows, that relying only on the model's outputs is not enough.”

A third task that Anthropic studied was writing poems. The researchers wanted to know if the model really did just improvise, predicting one word at a time. Instead, they found that Claude somehow looked ahead, picking the word at the end of the next line several words in advance.

For example, when Claude was given the prompt “A rhyming couplet: He saw a carrot and had to grab it,” the model responded, “His hunger was like a starving rabbit.” But using their microscope, the researchers saw that Claude had already hit upon the word “rabbit” when it was processing “grab it.” It then seemed to write the next line with that ending already in place.
