DeepSeek: Domestic Chip Adaptation


As the DeepSeek phenomenon sweeps through the tech landscape, domestic GPU manufacturers are diving headlong into this adaptation wave. However, despite the apparent similarity of their moves, each company has distinct advantages and strategies that set it apart.

Today, industry reports often focus on the sheer number of companies adapting to DeepSeek. However, there is a notable lack of in-depth exploration of the differences among them. Is it a divergence in technological routes, varying performance levels, diverse ecosystem developments, or differing application scenarios that sets these companies apart?

Choosing Between Original and Distilled Models

When it comes to adapting DeepSeek models, the efforts of chip manufacturers generally fall into two groups. One focuses on adapting the original R1 and V3 models, while the other targets the lighter distilled versions derived from R1.

The distinctions among these three model types are significant:

DeepSeek R1 is positioned as an inference-first model, designed for scenarios requiring deep logical analysis and problem-solving. It excels at tasks such as mathematics, programming, and reasoning.

Conversely, DeepSeek V3 is a general-purpose large language model that supports efficient and flexible applications across a variety of natural language processing tasks, catering to the needs of multiple fields.


The original R1 and V3 models usually possess a larger parameter count, resulting in a more complex structure.

The DeepSeek-R1 series of distilled models offers lightweight versions with fewer parameters, intended to maintain a certain level of performance while reducing resource consumption. This makes them suitable for lightweight deployments and resource-constrained scenarios, such as edge-device inference and rapid AI application validation for small and medium-sized enterprises.

Even though manufacturers are racing to adapt to DeepSeek, the types of models they are adapting differ greatly.

While mainstream GPU vendors are accelerating the adaptation of DeepSeek models, only about half have explicitly announced support for the original R1 and V3 models. These models place extremely high demands on chip computing power, memory bandwidth, and multi-card interconnect technology. Companies like Huawei Ascend and Haiguang Information fall into this category.

The other manufacturers primarily support the DeepSeek-R1 series of distilled models (with parameter counts ranging from 1.5 billion to 8 billion). Since these distilled models are based on Tongyi Qianwen (Qwen) and LLaMA, platforms that can already support Qwen and LLaMA models can generally adapt to them with minimal extra effort. Companies like Moore Threads and Biren Technology are examples of this group.

Different model sizes suit different scenarios: cloud-side inference demands larger parameter counts and optimal performance, so it primarily uses the original R1 or V3 models; edge-side chips typically accommodate models in the 1.5B to 8B range, which are mature enough that no substantial extra work is needed.
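As a rough illustration (my own back-of-envelope sketch, not a figure from the article), the reason the 1.5B-8B range fits edge devices while the original R1/V3 models demand server clusters comes down to weight memory, which scales with parameter count times bytes per parameter:

```python
def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight-only memory footprint: parameters x bytes per parameter.

    Ignores KV cache and activations, so real deployments need headroom on top.
    """
    return params_billions * bytes_per_param  # 1e9 params x bytes, expressed in GB

# An 8B model in FP16 (2 bytes/param) needs ~16 GB of weights; INT8
# quantization halves that to ~8 GB, within reach of a single edge
# accelerator. A full 671B model in FP16 needs ~1342 GB, which is why
# it requires multi-card interconnects rather than one chip.
print(model_memory_gb(8, 2))    # 16.0
print(model_memory_gb(8, 1))    # 8.0
print(model_memory_gb(671, 2))  # 1342.0
```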

Company Advantages: What Sets Them Apart?

In addition to the differences in model types, companies have adopted various technological routes, resulting in distinct challenges during adaptation.

Firstly, considering the current technological ecosystem and practical application scenarios, running and adapting DeepSeek models primarily relies on Nvidia's hardware and its CUDA programming ecosystem.


As such, each manufacturer's adaptability is contingent upon their compatibility with the original development ecosystem.

This means that, at present, DeepSeek is mainly adapted to Nvidia chips, which influences the application and performance of other hardware platforms. How easily a large model like DeepSeek, developed on Nvidia GPUs, can be adapted depends on whether a chip is compatible with CUDA. Manufacturers with CUDA-compatible chips offer varying degrees of interoperability.

Secondly, GPUs vary in computational capacity (such as FLOPS and memory bandwidth), which directly affects how quickly DeepSeek can handle large-scale deep learning tasks. Some GPUs demonstrate superior efficiency ratios, making them suitable for running DeepSeek in low-power environments.
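Memory bandwidth in particular tends to dominate single-stream inference: a common rule of thumb (a simplification of mine, not from the article) is that generating each token requires reading every weight once, so decoding throughput is bounded by bandwidth divided by model size:

```python
def decode_tokens_per_sec(params_billions: float,
                          bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Rough upper bound on single-stream decoding speed.

    Assumes each generated token streams all weights from memory once;
    ignores KV-cache traffic, batching, and compute limits.
    """
    model_size_gb = params_billions * bytes_per_param
    return bandwidth_gb_s / model_size_gb

# A 7B FP16 model (14 GB of weights) on a GPU with 1000 GB/s of
# memory bandwidth tops out around 71 tokens/s for one stream:
print(round(decode_tokens_per_sec(7, 2, 1000), 1))  # 71.4
```

This is why the article's point about memory bandwidth matters as much as raw FLOPS for inference-oriented chips.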

Commercial Applications of DeepSeek

The commercial deployment of DeepSeek can take various forms:

Cloud Deployment:

For instance, DeepSeek models can provide services through the Huawei Cloud platform, allowing enterprise customers to use DeepSeek's capabilities (such as image recognition, natural language processing, and speech recognition) via API calls or cloud services. Companies pay based on actual usage, such as computing resources or the number of API calls, reducing initial investment costs. This cloud service model eliminates the need for enterprises to deploy hardware on-site, allowing for quick implementation and application.
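To make the pay-per-use model concrete, here is a minimal billing sketch; the prices and volumes are hypothetical placeholders, not Huawei Cloud's actual rates:

```python
def monthly_api_cost(calls_per_day: int,
                     avg_tokens_per_call: int,
                     yuan_per_million_tokens: float,
                     days: int = 30) -> float:
    """Estimate a monthly pay-per-use bill from token consumption."""
    total_tokens = calls_per_day * avg_tokens_per_call * days
    return total_tokens / 1_000_000 * yuan_per_million_tokens

# Hypothetical workload: 10,000 calls/day, 1,500 tokens per call,
# billed at 8 yuan per million tokens.
print(monthly_api_cost(10_000, 1_500, 8))  # 3600.0 (yuan/month)
```

The appeal for smaller enterprises is visible here: costs scale smoothly with usage instead of requiring an up-front hardware purchase.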

On-Premise Deployment:

There are also integrated machine forms: currently, DeepSeek large model integrated machines are categorized into inference machines and training-inference machines.


The DeepSeek inference integrated machines come equipped with models of different sizes, such as DeepSeek-R1 32B, 70B, and the full 671B version, priced from hundreds of thousands to several million yuan and targeting companies sensitive to data security and privacy. The training-inference integrated machines are even more expensive, reaching millions of yuan, particularly those designed for pre-training and fine-tuning the DeepSeek-R1 32B model.

Enterprises can also deploy solutions themselves: those with extremely high performance requirements (such as autonomous driving or financial risk control) or stringent security demands (such as government and financial institutions) can deploy the DeepSeek model locally on hardware like GPU chips to achieve maximum performance.

The current commercial pattern shows that, because deploying GPU chips and DeepSeek models on-premises is costly, enterprise users typically first test on public clouds to confirm the technology fits their needs, and only then consider private cloud deployments or integrated machines. Small and medium-sized enterprises therefore lean more toward using these technologies through cloud services.
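The cloud-first-then-on-prem pattern is essentially a breakeven calculation. The sketch below (with hypothetical prices, not figures from the article) shows when an up-front integrated machine overtakes ongoing cloud spend:

```python
def breakeven_months(machine_cost_yuan: float,
                     monthly_cloud_cost_yuan: float,
                     monthly_opex_yuan: float = 0.0) -> float:
    """Months until a purchased integrated machine beats recurring cloud fees.

    Simplified: ignores depreciation, utilization, and staffing differences.
    """
    monthly_saving = monthly_cloud_cost_yuan - monthly_opex_yuan
    if monthly_saving <= 0:
        return float("inf")  # cloud stays cheaper indefinitely
    return machine_cost_yuan / monthly_saving

# Hypothetical: a 1,000,000-yuan inference machine vs 50,000 yuan/month of
# cloud spend, with 10,000 yuan/month of on-prem power and maintenance.
print(breakeven_months(1_000_000, 50_000, 10_000))  # 25.0
```

At low usage the breakeven horizon stretches toward infinity, which is why smaller enterprises stay on the cloud while heavy, steady users migrate to integrated machines.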

Of course, some enterprises that prioritize data security or urgently need high performance are investing hundreds of thousands or even millions of yuan to deploy integrated machines that meet their requirements. As DeepSeek's open-source models have developed, demand for privatized deployment has steadily emerged, creating a burgeoning market for integrated machines and attracting numerous companies to enter it.

Who is Excelling in Commercializing DeepSeek?

Around the DeepSeek opportunity, both Ascend and Haiguang have made significant strides toward commercialization.

Integrated machines are in high demand, benefiting Ascend:

Around 70% of businesses are expected to adopt DeepSeek based on Ascend technology.

Recently, companies such as Huakun Zhiyu, Baode, Shenzhou Kuntai, and Yangtze Computing have released DeepSeek integrated machines, all built on Ascend products.

Notably, as the frequency of DeepSeek integrated machine releases increases, the industrial alliance surrounding Ascend continues to broaden.

Reports indicate that more than 80 companies have rapidly adapted or launched DeepSeek series models based on Ascend technology, providing external services.

An additional 20 or more companies are expected to go live within the next two weeks. This means that about 70% of Chinese enterprises adopting DeepSeek are aligning themselves with Ascend.

Compared with imported GPU solutions, the localized services and teams behind Ascend chips significantly influence DeepSeek deployment outcomes. For instance, in a data center with thousands of cards, the automatic parallelism feature of the MindSpore toolchain reduces the amount of distributed-training code by 70%.

Haiguang: Penetrating diverse scenarios including intelligent computing centers and finance:

Haiguang's collaboration with DeepSeek covers critical scenarios such as intelligent computing centers, finance, and smart manufacturing.

In the realm of intelligent computing centers, Haiguang Information has partnered with QCloud Technology to launch the “Haiguang DCU + Base Stone Intelligent Computing + DeepSeek Model” solution, supporting flexible billing based on tokens to lower the entry barriers for enterprise AI applications.

In financial technology, Zhongke Jincai has collaborated with Haiguang Information Technology Co., Ltd. to launch an integrated software and hardware solution that combines self-developed multi-scenario models with Haiguang DCU acceleration cards and achieves in-depth adaptation with DeepSeek models.

In smart manufacturing, the Haiguang DCU empowers industrial visual inspection and automated decision-making by adapting the DeepSeek-Janus-Pro multimodal model, assisting companies like SANY Heavy Industry in achieving intelligent upgrades on production lines.

In data management, the smart data management platform developed by Kongtian will fully adapt to Haiguang DCU, embedding DeepSeek into the platform as a “super engine” to support data processing in fields like natural resources, energy, and aerospace.

Moreover, JD Cloud has also released a DeepSeek large model integrated machine that supports domestic AI acceleration chips like Huawei Ascend and Haiguang.

Opportunities for Domestic GPUs are Arising

With the rollout and widespread application of DeepSeek integrated machines, market demand for domestic chips is increasing significantly.

Yang Jian, CTO of Muxi Technology, noted that many non-Nvidia cards are expected to join the large-model post-training segment this year.

He believes that the privatization of large models like DeepSeek presents an opportunity for domestic chips.

“The opportunity for domestic GPUs in 2025 lies in privatized deployment, primarily focusing on post-training and inference of large models,” Yang noted. He explained that while Nvidia GPUs have become prevalent in the AI sector, they are becoming scarce in retail channels, and privatized deployments rely heavily on the retail market. Should the privatized deployment market take off, domestic cards could see considerable opportunity.

As overseas restrictions on chip computing power draw closer, global computing capabilities may evolve along two parallel paths and gradually decouple. By 2026-2027, the strong-GPU base for pre-training and post-training is expected to remain Nvidia in the US, while in China it will be handled partly by Nvidia and partly by domestic chips. The post-training segment will likely welcome more non-Nvidia cards this year, as its cluster requirements are relatively lower and do not demand thousands of cards.

People at Tianshui Zhixin indicate that as domestic models achieve breakthroughs, demand for compatibility with domestic chips is increasing, presenting substantial growth opportunities this year.

The surge of interest surrounding the DeepSeek model also signals opportunities for an explosion of AI applications, steering chip manufacturers toward the inference computing capabilities that AI requires. Last year, Chinese chip evaluations focused primarily on training, viewing domestic chips as alternatives to Nvidia in training scenarios.

