Tumbling stock market values and wild claims have accompanied the release of a new AI chatbot by a small Chinese company. What makes it so different?
The reason behind this tumult? The “large language model” (LLM) that powers the app has reasoning capabilities that are comparable to US models such as OpenAI’s o1, but reportedly requires a fraction of the cost to train and run.
Analysis
Dr Andrew Duncan is the director of science and innovation for fundamental AI at the Alan Turing Institute in London, UK.
DeepSeek claims to have achieved this by deploying several technical strategies that reduced both the amount of computation time required to train its model (called R1) and the amount of memory needed to store it. Reducing these overheads resulted in a dramatic cut in cost, says DeepSeek. R1’s base model V3 reportedly required 2.788 million GPU hours to train (running across many graphical processing units – GPUs –…


