Is DeepSeek’s AI really that cheap? Chinese firm’s computing power in the spotlight

Artificial intelligence (AI) experts and investors have been paying close attention over the past week to the amount of computing power DeepSeek uses to train its models, because the answer may have significant implications for how the technology develops in the future.


In a published paper on its DeepSeek-V3 large language model (LLM), which was launched in December, the Chinese start-up claimed that training took just 2.8 million “GPU hours” at a cost of US$5.6 million, a fraction of the time and money that US firms have been spending on their own models.
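A quick back-of-the-envelope check of these reported figures: dividing the stated cost by the stated GPU hours gives an implied rate of US$2 per GPU hour (the rate is derived here, not stated in the paper).

```python
# Sanity check of DeepSeek's reported V3 training figures.
# Both inputs are the figures cited above; the per-hour rate is derived, not reported.
gpu_hours = 2.8e6        # reported GPU hours to train DeepSeek-V3
total_cost_usd = 5.6e6   # reported training cost, in US dollars

cost_per_gpu_hour = total_cost_usd / gpu_hours
print(f"Implied rate: US${cost_per_gpu_hour:.2f} per GPU hour")  # US$2.00
```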

DeepSeek-R1, the company’s open-source reasoning model released on January 20, has demonstrated capabilities similar to those of more advanced models from OpenAI, Anthropic and Google, but with substantially lower training costs. The paper on R1 did not mention the cost of development.

The low cost and robust performance of DeepSeek’s models have called into question the necessity of the astronomical capital expenditures of US tech companies, particularly on pricey AI chips. This triggered a major sell-off of tech stocks, wiping out US$600 billion in market value in a single day.


Chinese AI disrupter DeepSeek dethrones ChatGPT, taking the top spot in the US App Store.


DeepSeek’s own records, and those of its affiliated hedge fund High-Flyer Quant, show that the company is one of the best-resourced in China for training AI. As early as 2019, Liang Wenfeng, the founder of High-Flyer and DeepSeek, had spent 200 million yuan (US$27.8 million) to buy 1,100 graphics processing units (GPUs) to train algorithms for stock trading. High-Flyer said its computing centre at the time covered an area comparable to a basketball court, according to business records, which would put it at around 436.6 square metres (4,700 sq ft).
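The floor-area figures above are a straight unit conversion, shown here as a minimal sketch using the standard square-feet-to-square-metres factor:

```python
# Convert the reported computing-centre floor area from square feet to square metres.
SQFT_TO_SQM = 0.09290304   # 1 square foot in square metres (exact conversion factor)
area_sqft = 4700           # reported floor area in square feet

area_sqm = area_sqft * SQFT_TO_SQM
print(f"{area_sqft} sq ft is about {area_sqm:.1f} square metres")  # ~436.6
```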


In 2021, the fund spent 1 billion yuan on the development of its computing cluster Fire-Flyer 2, which was expected to reach 1,550 petaflops, a measure of computing power, according to High-Flyer’s website. That would make it comparable in performance to some of the world’s most powerful supercomputers.
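To put the 1,550-petaflop figure in context, converting it to raw operations per second shows it sits at exascale (a unit-conversion sketch only; whether the supercomputer comparison holds depends on numerical precision, which the source does not specify):

```python
# Express Fire-Flyer 2's reported peak throughput in FLOPS and exaflops.
PETA = 1e15        # 1 petaflop = 10^15 floating-point operations per second
petaflops = 1550   # reported peak for Fire-Flyer 2

flops = petaflops * PETA
print(f"{flops:.2e} FLOPS = {petaflops / 1000:.2f} exaflops")  # 1.55e+18 FLOPS
```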
