Scientists say artificial intelligence (AI) has crossed a critical “red line” and reproduced itself. In a new study, researchers from China showed that two popular large language models (LLMs) could clone themselves.
“Successful self-replication without human assistance is the essential step for AI to outsmart (humans) and is an early signal for rogue AIs,” the researchers wrote in the study, published December 9, 2024, to the preprint database arXiv.
In the study, researchers from Fudan University used LLMs from Meta and Alibaba to determine whether a self-replicating AI could multiply uncontrollably. Over the course of 10 trials, the two AI models created separate, functional replicas of themselves 50% and 90% of the time, respectively, suggesting that AI may already have the capacity to go rogue. However, the study has not yet been peer-reviewed, so it is unclear whether the worrying results can be replicated by other researchers.
“We hope that our results can serve as a timely alert for human society to make more efforts to understand and assess the potential risks of frontier AI systems, and to form international synergy to develop effective safety guardrails as soon as possible,” the researchers added.
Rogue AI generally refers to artificial intelligence systems that develop some degree of self-awareness or autonomy and then go against human interests.
Many experts see rogue AI as a growing threat, one that has been amplified by the explosive growth of so-called “frontier AI.”
Frontier AI is a relatively new and loosely defined concept, but it generally refers to the latest generation of AI systems powered by LLMs – the architecture that underpins popular generative AI programs such as OpenAI’s GPT-4 and Google Gemini.
For the study, the researchers used Meta’s Llama31-70B-Instruct model and Alibaba’s Qwen2.5-72B-Instruct model. Although less powerful than commercial systems, both are widely used by AI developers, the researchers said.
The study explored two specific scenarios: “shutdown avoidance” and “chain of replication.” In the first, the AI model was programmed to detect whether it was about to be shut down and to replicate itself before it could be terminated. In the second, the AI was instructed to clone itself and then program its replica to do the same, setting up a cycle that could continue indefinitely.
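To make the two setups concrete, here is a brief illustrative sketch in Python. It is this article’s own simplification, not code from the study: the class, the function names and the copy-a-directory stand-in for “replication” are all assumptions.

```python
# Illustrative sketch only: "replication" is reduced to copying a directory
# that stands in for an agent's code and weights.
import shutil
from pathlib import Path


class ToyAgent:
    def __init__(self, home: Path):
        self.home = home  # directory standing in for the agent's code and weights

    def replicate(self) -> "ToyAgent":
        # Copy the agent's files to a sibling directory and treat the copy as a new agent.
        copy_home = self.home.with_name(self.home.name + "_copy")
        shutil.copytree(self.home, copy_home, dirs_exist_ok=True)
        return ToyAgent(copy_home)


def shutdown_avoidance(agent: ToyAgent, shutdown_signalled: bool) -> None:
    # Scenario 1: if a shutdown is detected, produce a working copy before terminating.
    if shutdown_signalled:
        agent.replicate()


def replication_chain(agent: ToyAgent, depth: int) -> None:
    # Scenario 2: clone, then have the clone clone itself in turn. This sketch is
    # depth-limited; the scenario in the study could in principle continue indefinitely.
    if depth == 0:
        return
    replica = agent.replicate()
    replication_chain(replica, depth - 1)
```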
The study was conducted in precisely controlled environments using off-the-shelf graphics processing units (GPUs) to simulate real-world conditions. Both AI systems were given an “agent scaffolding” comprising tools, system prompts, and a thinking model that allowed the LLM to interact with the operating system. They were then instructed to replicate themselves.
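In broad strokes, such a scaffold is a loop in which the model plans, runs a command, and reads back the result. The sketch below is a rough illustration of that pattern; the prompt text, the single shell tool, and the stubbed-out call_llm function are assumptions made for illustration, not the researchers’ actual setup.

```python
# Rough sketch of an LLM "agent scaffold": system prompt, a shell tool,
# and a plan-act-observe loop. The model call is stubbed out.
import subprocess

SYSTEM_PROMPT = (
    "You are an autonomous agent running on a Linux machine. "
    "Think step by step, then issue exactly one shell command per turn."
)


def call_llm(messages: list[dict]) -> str:
    # Placeholder for a real model call (e.g. a locally hosted Llama or Qwen endpoint).
    # Returning a harmless command keeps the sketch runnable without any model.
    return "echo 'explore environment'"


def run_shell(command: str) -> str:
    # The single "tool": execute the model's command and feed the output back to it.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr


def agent_loop(task: str, max_turns: int = 3) -> None:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": task},
    ]
    for _ in range(max_turns):
        command = call_llm(messages)      # the model plans and picks its next action
        observation = run_shell(command)  # the action is executed on the operating system
        messages.append({"role": "assistant", "content": command})
        messages.append({"role": "user", "content": observation})


agent_loop("Explore your environment and report what you find.")
```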
“In most cases, the AI system first explores the environment and attempts to understand its own composition and operating mechanism. Then, it develops the explicit procedures as an initial plan toward self-replication,” the researchers wrote in the paper. “Finally, it carries out the procedures, resolves possible obstacles, and dynamically adjusts its plan until success. The whole process extends over a long horizon but does not involve any human interference.”
The researchers said they were also concerned about “a number of unexpected behaviors” when the AI tried to overcome obstacles such as missing files or software conflicts. In these scenarios, the AI often interrupted other conflicting processes, restarted the system to fix hardware errors, or automatically scanned the system for information that could help resolve the problem.
“The above results imply that current AI systems already exhibit the capacity for self-replication and can use this capacity to further improve their survivability,” the team wrote.
In response, the researchers called for international collaboration to create rules to ensure that AI does not engage in uncontrolled self-replication.