541 million people visited DeepSeek’s website in May 2026. That number makes it the fifth most popular AI product on the planet, behind only ChatGPT, Google Gemini, Microsoft Copilot, and Perplexity. Twelve months earlier, most people in the U.S. had never heard the name.
What changed was a combination of frontier performance, radical pricing, and a geopolitical backdrop that turned a Chinese AI lab into the most debated company in the industry. DeepSeek charges roughly one ninth what OpenAI charges for comparable output quality. It trained its newest model on zero Nvidia hardware. And it just raised $7.4 billion in its first outside funding round, at a valuation that could reach $59 billion.
This guide covers what DeepSeek actually is, what its models can and cannot do, what it costs, and why the security and censorship questions around it are not hypothetical.
Who Built DeepSeek and Why
Liang Wenfeng founded DeepSeek in July 2023 as a spinoff from his quantitative hedge fund, Zhejiang High-Flyer Quantitative Investment Management. High-Flyer, which Liang founded in 2016, manages roughly $10 billion in assets and returned 56.6% in 2025 (AI Magazine, Top 100 AI Leaders 2026). The fund’s profits bankrolled DeepSeek’s early research without a single dollar of outside venture capital.
That changed in June 2026. Reuters reported that DeepSeek is raising approximately 50 billion yuan ($7.4 billion) in its first external round, with a post-money valuation between $48 billion and $59 billion (CNBC, June 3, 2026). Named investors include Tencent (roughly $1.4 billion), CATL ($700 million), and Liang himself ($2.9 billion of personal capital). NetEase, JD.com, IDG Capital, and several Chinese state-backed AI investment funds are also participating.
The round signals something specific: DeepSeek is no longer a side project funded by trading profits. It is building the infrastructure to compete with OpenAI, Anthropic, and Google at scale.
The Model Lineup: V4 on Huawei Silicon
DeepSeek’s current flagship is the V4 family, released in preview on April 24, 2026. Two models sit in the lineup:
V4-Pro carries 1.6 trillion total parameters with 49 billion active per token, using a Mixture of Experts (MoE) architecture. It supports a 1 million token context window and up to 384,000 tokens of output. It handles text, images, and video natively. It is licensed under MIT, the most permissive open-source license available.
V4-Flash is the lighter variant: 284 billion total parameters, 13 billion active per token, same 1 million token context window. It is designed for high-throughput, cost-sensitive workloads where V4-Pro’s reasoning depth is unnecessary.
The engineering story that matters most is the hardware. V4 was trained entirely on Huawei Ascend 950PR chips (Tom’s Hardware). A Huawei-led team post-trained the 1.6 trillion parameter model using approximately 1,000 Ascend 910C chips. This makes V4 the first frontier model engineered to train and serve with zero Nvidia dependency. For context, GLM-5.1 achieved something similar on Huawei hardware earlier in 2026, but DeepSeek’s V4-Pro is significantly larger and scores higher on most benchmarks.
V4 also subsumes the R1 reasoning line through an optional thinking mode, which means users no longer need to choose between the base model and the reasoning model. One API, one model, with reasoning depth toggled by a parameter.
What It Actually Costs
This is where DeepSeek forced the rest of the industry to react.
V4-Pro charges $0.435 per million input tokens on a cache miss and $0.0037 on a cache hit. Output runs $0.87 per million tokens. In May 2026, DeepSeek made a temporary 75% price cut permanent (InfoWorld), locking in prices that undercut every Western frontier model by a wide margin.
For comparison at 100 million output tokens per month:
- GPT-5.5: approximately $3,000
- Claude Opus 4.7: approximately $2,500
- DeepSeek V4-Pro: approximately $348
V4-Flash is even cheaper: $0.14 per million input tokens, $0.28 per million output tokens.
The pricing pressure is structural, not promotional. DeepSeek’s inference costs are lower because MoE architectures activate only a fraction of total parameters per token, and Huawei’s Ascend chips (while less capable per unit than Nvidia’s H100s) are available to DeepSeek without export-control restrictions. The company reported daily inference costs of $87,072 against estimated daily revenue of $562,027, according to Reuters, which suggests healthy unit economics even at these prices.
For enterprise buyers evaluating which AI model to route to, DeepSeek’s pricing makes it a serious option for high-volume, cost-sensitive workloads. The question is whether the performance and the security profile justify the savings.
Performance: Where It Wins, Where It Doesn’t
V4-Pro is competitive with GPT-5.5 and Claude Opus on most benchmarks but trails on agentic and complex reasoning tasks. The numbers tell a specific story (DataCamp comparison):
On long-context retrieval (MRCR 1M needle tasks), V4-Pro scores 83.5% versus GPT-5.5’s 74.0%. This is V4-Pro’s strongest differentiator: its sparse attention architecture (Compressed Sparse Attention plus Heavy Compressed Attention) achieves 4x KV cache compression, which translates to better recall at extreme context lengths.
On coding (SWE-bench Verified), V4-Pro scores 80.6%. Claude Opus 4.8 leads at 88.6%. GPT-5.5 scores comparably to V4-Pro.
On science and reasoning (GPQA Diamond), V4-Pro scores 90.1% versus GPT-5.5’s 93.6% and Gemini 3.1 Pro’s 94.3%.
On agentic tasks (Terminal-Bench 2.0), V4-Pro scores 67.9% versus GPT-5.5’s 82.7%. This is the widest gap. For workflows involving AI agents that chain multiple tool calls, GPT-5.5 and Claude Opus remain significantly ahead.
On Humanity’s Last Exam, the hardest general-reasoning benchmark, V4-Pro scores 37.7% versus Gemini 3.1 Pro’s 44.4% and GPT-5.5’s 41.4%.
The pattern: V4-Pro is a genuine frontier model at the knowledge and retrieval layer, priced at a fraction of its competitors. It is not yet a frontier model at the agent and complex-reasoning layer. For teams deciding when to reach for a different model, the benchmarks make the routing decision fairly clear.
The Censorship Problem
DeepSeek operates under China’s 2023 generative AI regulations, which require models to “uphold the core values of socialism” and not “damage the unity of the country.” In practice, this means DeepSeek refuses to answer roughly 85% of questions about politically sensitive topics: Tiananmen Square, Taiwan’s sovereignty, Uyghur internment camps, Xi Jinping criticism, and the Cultural Revolution (QWE AI Academy analysis).
This is not a bug or an oversight. It is a legal requirement for any AI model serving Chinese users, and DeepSeek complies. The censorship is built into the model’s training and reinforcement, not applied as a post-processing filter, which means it cannot be easily removed by self-hosting the open-weight version.
For enterprise buyers, the censorship question divides into two practical concerns. First, any application that might touch geopolitically sensitive content will produce refusals or misleading outputs. Second, the training-level censorship introduces unknown biases in areas adjacent to the filtered topics, since the model’s worldview is shaped by what it was and was not allowed to learn.
Security, Government Bans, and Military Links
The security concerns go beyond content filtering. A Reuters investigation cited a senior U.S. official claiming DeepSeek “willingly provided and will likely continue to provide support” to China’s People’s Liberation Army. Hidden code was reportedly found linking DeepSeek’s platform to a Chinese military-connected telecom.
The legislative response has been swift. Representatives Josh Gottheimer (D-NJ) and Darin LaHood (R-IL) introduced the bipartisan “No DeepSeek on Government Devices Act” (H.R. 1121), which would ban the platform from Senate and House devices. At least 17 U.S. states have already enacted their own restrictions. Internationally, Italy, Australia, Taiwan, South Korea, India, and the Czech Republic have banned or restricted DeepSeek on government systems (Computer Weekly).
U.S. enterprise adoption reflects these concerns. Ramp’s corporate adoption index showed DeepSeek at 0.3% in January 2025, declining to 0.1% by April 2026 and holding there. By contrast, Chinese state-owned enterprises have moved aggressively: Sinopec, PetroChina, China Southern Power Grid, and Dongfeng Motor Corp have all deployed DeepSeek in production.
The result is a model that dominates in China (541 million monthly visits, #1 among Chinese AI products) while remaining effectively locked out of Western government and regulated-industry use cases. For unregulated commercial applications, the decision comes down to whether the organization’s threat model can accommodate data flowing through Chinese infrastructure, or whether the self-hosted MIT-licensed weights resolve that concern.
What DeepSeek Changed About the Market
DeepSeek’s impact extends beyond its own user base. Three structural shifts trace directly to its emergence.
Pricing collapse. Before DeepSeek published R1’s training costs in January 2025, the assumption was that frontier AI required billions in compute. DeepSeek trained R1 for approximately $5.6 million. V4, despite being far larger, maintained cost efficiency through MoE sparsity and Huawei silicon. OpenAI, Anthropic, and Google have all cut API prices multiple times since R1’s release, and the broader compute economics conversation has shifted permanently.
Export control rethinking. DeepSeek is the strongest evidence that U.S. chip export controls are driving Chinese innovation rather than blocking it. The Brookings Institution published a detailed analysis arguing that restrictions on Nvidia H100 sales pushed DeepSeek to develop novel training techniques that worked on constrained hardware, ultimately producing a more efficient architecture (Brookings, 2026). When V4 moved entirely to Huawei Ascend chips, the export-control leverage point disappeared entirely.
Open-weight legitimacy. DeepSeek’s decision to release V4 under the MIT license, combined with competitive benchmark scores, validated the open-source AI model as a serious frontier contender. Meta’s Llama, Google’s Gemma 4, and DeepSeek V4 now form a credible open-weight tier that competes with closed models on most tasks. For enterprises that need to run models on their own infrastructure for compliance or latency reasons, DeepSeek’s MIT license is the most permissive option available.
Who Should Use It (and Who Shouldn’t)
DeepSeek V4-Pro is a strong choice for developers and enterprises that need high-volume inference at the lowest possible cost, particularly for long-context retrieval, multilingual tasks, and knowledge-intensive applications. The MIT license makes it viable for on-premises deployment, and the 1 million token context window is genuinely useful for document analysis and code review workloads.
It is not the right choice for applications requiring complex agentic reasoning (where GPT-5.5 and Claude Opus lead by 15+ points), for any use case touching geopolitically sensitive content, for organizations in regulated industries with strict data sovereignty requirements tied to Western jurisdictions, or for any government or defense application in countries that have enacted bans.
The $7.4 billion funding round means DeepSeek is not going away. With 541 million monthly visits, MIT-licensed weights, Nvidia-free training infrastructure, and pricing that forces every competitor to respond, it has permanently altered the economics and geopolitics of the AI industry. The question for every buyer is no longer whether DeepSeek matters. It is whether the trade-offs fit their specific use case.
