DeepSeek

In January 2025, Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., better known as DeepSeek, launched its eponymous chatbot alongside the DeepSeek-R1 large language model (LLM). The Chinese artificial intelligence company was founded in July 2023 by Liang Wenfeng, who serves as CEO of both DeepSeek and its parent, the Chinese hedge fund High-Flyer.

DeepSeek-R1 drew attention for producing responses comparable to those of other contemporary LLMs, such as OpenAI’s GPT-4 and o1. The result is notable for a company less than two years old. Commentators have described R1’s success as “upending AI,” suggesting that DeepSeek’s approach to building LLMs could disrupt the industry’s status quo.

DeepSeek reports training costs far below those of comparable LLMs. The company claims it trained its V3 model for US$6 million, compared with the roughly US$100 million reported for OpenAI’s GPT-4 in 2023, and with approximately one-tenth the computing power consumed by Meta’s comparable model, Llama 3.1. If these figures hold up, they could substantially lower the barrier to developing and deploying LLMs.
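
To put the reported figures in proportion, the short Python sketch below works through the arithmetic. The constants are the claimed and reported numbers quoted above, not audited costs, and the names `cost_ratio`, `DEEPSEEK_V3_COST_USD`, `GPT4_COST_USD`, and `COMPUTE_FRACTION_VS_LLAMA` are illustrative only.

```python
# Reported figures from the article (company claims and press estimates,
# not audited numbers). All names here are illustrative.
DEEPSEEK_V3_COST_USD = 6_000_000      # claimed training cost of DeepSeek V3
GPT4_COST_USD = 100_000_000           # reported training cost of GPT-4 (2023)
COMPUTE_FRACTION_VS_LLAMA = 1 / 10    # approximate compute relative to Llama 3.1


def cost_ratio(cost_a: float, cost_b: float) -> float:
    """Return cost_a expressed as a fraction of cost_b."""
    if cost_b <= 0:
        raise ValueError("cost_b must be positive")
    return cost_a / cost_b


ratio = cost_ratio(DEEPSEEK_V3_COST_USD, GPT4_COST_USD)
print(f"V3 cost as a share of GPT-4: {ratio:.0%}")          # 6%
print(f"GPT-4 cost as a multiple of V3: {1 / ratio:.1f}x")  # 16.7x
print(f"Compute saved vs Llama 3.1: {1 - COMPUTE_FRACTION_VS_LLAMA:.0%}")  # 90%
```

On these numbers, V3's claimed training cost is about 6% of GPT-4's reported cost, or roughly one-seventeenth.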

DeepSeek’s models are described as “open-weight,” meaning that the exact parameters are openly shared, but the training data is not openly licensed. Other developers can therefore build on and modify the models, while the company keeps its training data proprietary. This approach could encourage collaboration and innovation, since developers can use DeepSeek’s models as a base for new applications and services.

DeepSeek’s success has implications for the AI industry as a whole. If a newer entrant can develop competitive LLMs at lower cost and with less computing power, the dominance of more established companies may be challenged. The open-weight approach could add to that pressure by letting outside developers build on and modify the company’s models. However, it remains to be seen how the market will receive DeepSeek’s models and how they will be used in practice.

Looking ahead, DeepSeek’s long-term success will depend on whether it can keep improving its models. The effect of its open-weight approach on the wider industry is also worth monitoring. As the AI industry continues to evolve, companies like DeepSeek are likely to play an increasingly important role in shaping the future of artificial intelligence.