On Boxing Day, a small Chinese startup called DeepSeek unveiled a new AI system that could match the capabilities of cutting-edge chatbots from companies like OpenAI and Google.
That alone would have been a significant step. But the team behind the system, called DeepSeek-V3, described an even bigger milestone. In a research paper explaining how they built the technology, DeepSeek engineers said they used only a fraction of the highly specialized computer chips that leading AI companies have relied on to train their systems.
These chips are at the center of a tense technological competition between the United States and China. As the U.S. government strives to maintain the country’s lead in the global AI race, it is trying to limit the number of powerful chips, like those made by Silicon Valley company Nvidia, that can be sold to China and other rivals.
But the DeepSeek model’s performance raises questions about the unintended consequences of the U.S. government’s trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools freely available on the internet.
The DeepSeek chatbot answered questions, solved logic problems and wrote its own computer programs as capably as anything already on the market, according to the benchmark tests that U.S. AI companies use.
And it was created on the cheap, challenging the prevailing idea that only the tech industry’s biggest companies – all based in the United States – could afford to make the most advanced AI systems. The Chinese engineers said they needed only about $6 million in raw computing power to build their new system. That is roughly one-tenth of what tech giant Meta spent on its latest AI technology.
“The number of companies that have $6 million to spend is significantly greater than the number of companies that have $100 million or $1 billion to spend,” said Chris V. Nicholson, an investor with the venture capital firm Page One Ventures, which focuses on AI technologies.
Since OpenAI sparked the AI boom in 2022 with the release of ChatGPT, many experts and investors had concluded that no company could compete with the market leaders without spending hundreds of millions of dollars on specialized chips.
The world’s leading AI companies train their chatbots using supercomputers that use as many as 16,000 chips, if not more. DeepSeek engineers, by contrast, said they needed only about 2,000 specialized computer chips from Nvidia.
Constraints on chips in China forced DeepSeek engineers to “train it more efficiently so it can still be competitive,” said Jeffrey Ding, an assistant professor at George Washington University who specializes in emerging technology and international relations.
Earlier this month, the Biden administration issued new rules that aim to prevent China from obtaining advanced AI chips through other countries. The rules build on several previous rounds of restrictions that prevent Chinese companies from buying or manufacturing cutting-edge computer chips. President Trump has not yet indicated whether he will keep the rules or roll them back.
The U.S. government has attempted to keep advanced chips out of the hands of Chinese companies over concerns they could be used for military purposes. In response, some companies in China have stockpiled thousands of chips, while others source them from a thriving underground market of smugglers.
DeepSeek is run by a quantitative stock trading company called High-Flyer. By 2021, it had channeled its profits into acquiring thousands of Nvidia chips, which it used to power its earlier models. The company, which did not respond to requests for comment, has become known in China for scooping up fresh talent from top universities with the promise of high salaries and the freedom to pursue the research questions that most pique their interest.
Zihan Wang, a computer engineer who worked on an earlier DeepSeek model, said the company also hires people without a computer science background to help the technology understand and generate poetry and ace questions on the notoriously difficult Chinese college entrance exam.
DeepSeek doesn’t make any products for consumers, letting its engineers focus entirely on research. That means its technology is not hemmed in by the strictest aspects of China’s AI regulations, which require consumer-facing technology to comply with government controls on information.
Leading U.S. companies continue to advance the state of the art in AI. In December, OpenAI unveiled a new “reasoning” system called o3 that exceeds the performance of existing technologies, although it is not yet widely available outside the company. But DeepSeek continues to show that it is not far behind. This month it released its own impressive reasoning model.
(The New York Times has sued OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to AI systems. OpenAI and Microsoft have denied those claims.)
A crucial part of this rapidly evolving global market is an old idea: open source software. Like many other companies, DeepSeek has open-sourced its latest AI system, meaning it has shared the underlying code with other companies and researchers. This allows others to build and distribute their own products using the same technologies.
While employees at major Chinese tech companies are limited to collaborating with colleagues, “if you’re working on open source, you’re working with talent around the world,” said Yineng Zhang, a senior software engineer at San Francisco-based Baseten who works on the open source SGLang project. He helps other people and companies build products using DeepSeek’s system.
The open source ecosystem for AI gathered steam in 2023 when Meta freely shared an AI system called Llama. Many assumed that this community would thrive only if companies like Meta – tech giants with massive data centers full of specialized chips – continued to open source their technologies. But DeepSeek and others have shown that they, too, can extend the powers of open source technologies.
Many executives and experts have argued that large American companies should not open source their technologies because they could be used to spread disinformation or cause other serious harm. Some U.S. lawmakers have explored preventing or limiting the practice.
But others argue that if regulators stifle progress in open source technology in the United States, China will gain a significant advantage. If the best open source technologies come from China, they argue, American developers will build their systems on top of those technologies. In the long term, this could put China at the heart of AI research and development.
“The center of gravity of the open source community has moved to China,” said Ion Stoica, a computer science professor at the University of California, Berkeley. “This could be a huge danger for the United States” because it allows China to accelerate the development of new technologies.
Hours after his inauguration, President Trump reversed a Biden administration executive order that threatened to limit open source technologies.
Dr. Stoica and his students recently built an AI system called Sky-T1 that rivals the performance of the latest OpenAI system, called OpenAI o1, on some benchmarks. They needed only $450 in computing power.
They did this by building on top of two open source technologies released by Chinese tech giant Alibaba.
Their $450 system isn’t as powerful as OpenAI’s technology or DeepSeek’s new system. And the techniques they used are unlikely to produce systems that exceed the performance of the leading technologies. But the project showed that even operations with tiny resources can create competitive systems.
Reuven Cohen, a technology consultant in Toronto, has been using DeepSeek-V3 since late December. He says it’s comparable to the latest systems from OpenAI, Google and San Francisco startup Anthropic – and much cheaper to use.
“DeepSeek is a way for me to save money,” he said. “This is the kind of technology someone like me wants to use.”