Chinese AI Models Have Lagged the US Frontier by 7 Months on Average Since 2023
Key topics
The AI gap between the US and China has averaged just 7 months since 2023, sparking debate over whether China's rapid progress is largely due to distilling frontier models through APIs. Some commenters argue that China is innovating, but not necessarily in training large models, while others point out that constraints like the Nvidia chip export ban have driven China to optimize its training practices. The discussion highlights the cat-and-mouse game between the US and China, with both sides using export controls and legislation as levers of power. As one commenter noted, constraints can breed improvements, and China's progress is a testament to this phenomenon.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 4m after posting
- Peak period: 31 comments in 1-2h
- Avg / period: 8.9 comments
- Based on 80 loaded comments
Key moments
- 01 Story posted: Jan 8, 2026 at 12:40 PM EST (1d ago)
- 02 First comment: Jan 8, 2026 at 12:43 PM EST (4m after posting)
- 03 Peak activity: 31 comments in 1-2h (hottest window of the conversation)
- 04 Latest activity: Jan 9, 2026 at 12:49 AM EST (1d ago)
They use H800s, while the major US labs are on H100s (2-3x faster).
What doesn't kill you really does make you stronger.
For American and other non-PRC companies thinking of using Chinese models, doesn't this have to be balanced with the risk that the US or its leadership may kneecap the Chinese models through export controls, an executive order, or some other means?
And the Chinese have been a huge source of innovation in the field.
They have been a source of innovation, but probably not in training them.
It was much easier when companies had models on /completion-style APIs, because you could actually get the logits for each generation step and use them as a dataset to fit your model to.
That isn't to diminish the efforts of the Chinese developers though, they are great.
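For the curious, here is a minimal sketch of what logit-level distillation from such an API could look like, assuming per-step top-k token ids and logprobs have already been collected from a /completions-style teacher; the tensor names and shapes are illustrative assumptions, not anything from the thread:

```python
# Minimal sketch of logit-level distillation, assuming per-step top-k token
# ids and logprobs were already collected from a /completions-style teacher
# API. All tensor names/shapes here are illustrative assumptions.
import torch
import torch.nn.functional as F

def distill_loss(student_logits: torch.Tensor,   # [seq, vocab]
                 teacher_ids: torch.Tensor,      # [seq, k] top-k token ids
                 teacher_logprobs: torch.Tensor  # [seq, k] their logprobs
                 ) -> torch.Tensor:
    log_q = F.log_softmax(student_logits, dim=-1)
    log_q_k = log_q.gather(-1, teacher_ids)      # student mass on teacher's top-k
    p = teacher_logprobs.exp()
    p = p / p.sum(dim=-1, keepdim=True)          # renormalize the truncated dist.
    # Forward KL on the truncated support: sum_k p_k * (log p_k - log q_k)
    return (p * (p.log() - log_q_k)).sum(dim=-1).mean()
```

Chat-style APIs that return only sampled text (or at most a handful of top logprobs) make this signal much weaker, which is the commenter's point about the old /completion endpoints.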
My intuition is that one needs a LOT of API credits to distill such large models.
And I guess the idea is that there is this extreme inflection point in utility somewhere that makes it so getting there first gives you some incredible economic edge.
It might not exist though. Either utility plateaus and it's bubble-crash-of-the-century time, or it just keeps going up but without any specific point where you can differentiate something.
Yes, it's clear by now that it's way beyond the capacity of those AIs, and the odds are pretty good it's impossible to a large extent (but some limited version of it may be possible).
We "distilled" modern cars from Model-T. You still driving the car that was "first" off an assembly line?
This is the normal improvement of manufactured stuff. Your handwavy "it was first so it's winner-winner chicken dinner!" is little more than your personal form of expression.
Since LLMs are a distillation of web content, by your poorly defined metric you must not value LLMs and AI companies? The content already existed! They're just a new indexing tool.
Each wizard school also seems to take a different approach and have different goals. Soon people will benchmark lawyers with Lego.
The machines these models run on are well known. They're not black boxes. The results will be same-y, despite software engineers' awareness that the timelines, processes, and companies that got there are different.
UPS trucks may carry different sizes and shapes of packages day to day but their upper bounds on total weight and geometry exist too.
A Honda and a Ford can look different, but physical reality, whether the measure is user feedback (human biology is physical) or physics itself, still results in very same-y four wheels, etc. etc.
What's strange to me is all the software engineers who ignore physics. All of our applied knowledge that gives rise to software engineering also constrains the outcomes. Our ability to sit down every day and arbitrarily slice up data in various ways is very much constrained by physics like everything else.
The easy-money, endless-hype era of ZIRP, where SWEs failed up thanks to endless employment opportunities, has resulted in way too many SWEs believing their efforts on some trivial shit like a JS framework or some CSS designs are propelling humans into the future.
Nah, it's just physics as usual. Y'all's sensory memory is just parroting the crap it memorized.
Doesn't matter: if they're good enough and cheaper, they'll sink the US model-makers eventually. The free market demands it.
The US invented solar panels, and lead in solar panel tech for a long time. Who leads in solar panel tech now?
China has a playbook for de-industrializing its capitalist rivals. If we leave MBAs and free-marketers in power, China will "come to dominate all technologies, including A.I., and ... America [will] export little more than soybeans and corn" (https://www.nytimes.com/2025/12/17/opinion/trump-ai-chips-nv...).
This kinda sounds like you're talking about Trump, but I think the problem predates him and is far deeper. If anything, Trump is a spastic reaction to the deeper problem. He won because his rhetoric gestured in the direction of fixing the problem, but he's too incompetent to pull it off (and the bulk of the competent people don't want to fix the problem for ideological reasons).
This is how you go from stability to world wars. A couple of rich guys got together and decided they were going to redraw all of the maps and toss the rulebook overboard, and it is for the most part going their way. People are being executed and the useful idiots are falling over each other to defend it.
If you had told me in 1999 that this would happen by 2026 I would have happily declared you mad, but here we are.
It's way deeper than that, though. It's stuff like US businessmen choosing to literally move the US's rare-earth magnet production capacity to China, teaching China how to make them in the process (https://www.nytimes.com/2025/12/31/business/china-rare-earth...). It's the US knowing about its rare-earth vulnerability for a decade or more but being completely unable to do anything about it. It's the US losing other strategic capabilities like large-scale electronics manufacturing capacity, and people being totally fine with that because "cheaper iPhones, more margins, good deal!"
But the singular focus on the destruction of what is a cornerstone of the stability of the Western hemisphere is absolutely unprecedented. And to see so many people falling for it, hook, line and sinker. They are so blasted with crazy things that they no longer see anything strange at all about each and every day's happenings, and they even jump to defend the absolutely indefensible: things people would - rightly - have been horrified by less than a decade ago are now perfectly normal.
No, because it's not a problem of economic development, but political ideology.
China's political priority is technological dominance and capability, and it views free markets as a tool subordinate to those goals. The US's political priority is financial wealth, and an extreme ideological attachment to free markets that overrides most other priorities. The US has an ideological vulnerability that China is well-positioned to exploit.
This problem goes well beyond Trump, and has roots that are very deep.
Lest you forget: China is controlled by the CCP and is not a democracy. It will not affect political priorities if "more Chinese become middle class or wealthy" and "view things differently." The Chinese political system does not answer to them, and will only throw them a bone if there's a major threat to stability.
You're echoing the 90s-era hope that free markets would bring political liberalization to China, but history has debunked that idea.
For the size/performance, yes.
> In any case, they wouldn't exist if not for superior models they were distilled from.
So? Those models wouldn't exist without the sum total of human knowledge. As long as a work is transformative why does it matter?
Measured by the DCI, Chinese AI models are about 1.5 years ahead of US models.
DCI = Dust42 Capability Index: MBP Max 64GB, Qwen3-80B MLX 4-bit quant, 40 tokens per second. It's not at Claude Opus level, but very, very useful if you have no internet, i.e. on a flight. And occasionally it surpasses even Opus by far. Opus is a pain in the neck once the coding task at hand surpasses its capabilities; Qwen3 is much easier to guide step by step to a solution.
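As a point of reference, here's a minimal sketch of running a local 4-bit Qwen3 quant via mlx-lm on Apple silicon; the exact Hugging Face repo name below is an assumption, so check mlx-community for the current conversion:

```python
# Minimal sketch: local 4-bit Qwen3 via mlx-lm on Apple silicon.
# The repo name below is an assumption; substitute the actual
# mlx-community 4-bit conversion you want to run.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit")
print(generate(model, tokenizer,
               prompt="Write a binary search in Python.",
               max_tokens=256))
```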
My theory is that these models serve the purpose of being relatively easy to run/tweak for researchers, and mainly serve to demonstrate the effectiveness of new techniques in training and inference, as well as the strength of AI labs that created them.
They are not designed to be state of the art commercial models.
By choosing bigger model sizes, running more training epochs, and drilling the models a bit more on benchmarking questions, I'm sure the Chinese could close the gap, but that would delay these models, make them more expensive and harder to run without showing any tangible research benefit.
Also my 2c: I was perfectly happy with Sonnet 3.7 as of a year ago, if the Chinese have a model really as good as that (not only one that benchmarks as well), I'd definitely like to try it.
GLM-4.7 feels like a mix of Sonnet 4.5 and GPT-5 (the first version, not the later ones). It has deep, deep knowledge, but it's often just not as good in execution.
They're very cheap to try out, so you should see how your mileage varies.
Of course, for the hardest possible tasks that only GPT 5.2 approaches, they're not up to scratch. And for the hard-ish tasks in C++, for example, that Opus 4.5 tackles, Minimax feels closer, but just doesn't "grok" the problem space well enough.
Adding "behind" after "lag" as a verb is more of a "because it sounds good," perhaps a subconscious way to emphasize the verb, but it isn't a grammatical requirement at all.
Leaving it off is almost certainly more to keep the headline short than anything else.
Note also that these aren't really questions of grammar (syntax) but meaning (semantics). Does "lagged" mean the same thing as "trailed" in this kind of construction? It didn't some decades ago, but maybe it does today. Or will tomorrow.
For me, there are three idiomatic forms:
1. Using "lag behind" gives a target/reference as a prepositional relationship, not as an object of the verb "to lag".
2. Using "caused to lag" allows one to specify a causal agent, but again not as an object of the verb "to lag".
3. Using "lag" alone is a subject-verb construct, leaving an implicit target/reference from context expectations. A coach or supervisor might scold someone for lagging.
As a bit of a tangent, I actually wonder if the etymology of "to lag" is more Germanic than some people assume. The verb lagern has many uses for placing, storing, and leaving behind. It's where our English concept of a "lager" beer comes from too, referencing the way the beer is fermented in (cold) storage. If this linguistic connection remained fresh, we might think of an SVO construct of lagging as the opposite of the intent in this article. The leader would lag the follower by leaving them behind!
It's thieves all the way down.
All frontier US models are closed-weight. What the Chinese labs are doing is great, because open weights help everyone. There is also a lot of research happening thanks to these open weights; look how much research is being done using Qwen models in the US (Microsoft etc.) and in the rest of the world.
Have you seen the Manifold-Constrained Hyper-Connections (mHC) paper from a few days ago from DeepSeek? It projects the residual-connection space onto a constrained manifold to keep identity-mapping properties while enabling richer internal connectivity, so it basically eliminates a huge problem.
They also released a LOT of training tricks and innovations around optimizing inference and training.
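A toy sketch of the general idea as described above, not the paper's actual formulation: keep a learned matrix that mixes residual streams close to an identity-preserving family, here by projecting it toward (near-)doubly-stochastic form with Sinkhorn normalization. The choice of constraint and all names are illustrative assumptions:

```python
# Toy illustration only; not DeepSeek's actual mHC formulation.
# Mix n residual streams with a matrix projected toward the doubly
# stochastic set, so the identity map stays feasible and no stream's
# contribution can blow up.
import torch

def sinkhorn_project(logits: torch.Tensor, iters: int = 10) -> torch.Tensor:
    """Approximately project an n x n matrix onto the doubly stochastic
    set (rows and columns each summing to 1) by alternating normalization."""
    M = logits.exp()                         # ensure positivity
    for _ in range(iters):
        M = M / M.sum(dim=1, keepdim=True)   # normalize rows
        M = M / M.sum(dim=0, keepdim=True)   # normalize columns
    return M

n, d = 4, 8                                  # streams x hidden width (toy sizes)
streams = torch.randn(n, d)
mix_logits = torch.eye(n) + 0.1 * torch.randn(n, n)  # initialize near identity
mixed = sinkhorn_project(mix_logits) @ streams       # constrained cross-stream mixing
```

By the Birkhoff-von Neumann theorem, doubly stochastic matrices are convex combinations of permutation matrices, so the identity mapping remains a feasible point of the constrained set, which is the flavor of identity-preserving property the comment describes.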
As to other industries:
"China leads research in 90% of crucial technologies — a dramatic shift this century" [1]
And here's[2] "China Is Rapidly Becoming a Leading Innovator in Advanced Industries", a big report on where they lead and how.
1. https://www.nature.com/articles/d41586-025-04048-7
2. https://itif.org/publications/2024/09/16/china-is-rapidly-be...