DeepMind and OpenAI Win Gold at ICPC
Posted 4 months ago · Active 3 months ago
codeforces.com · Tech story · High profile
Tone: heated, mixed · Debate score: 80/100
Key topics
Artificial Intelligence
Competitive Programming
LLMs
DeepMind and OpenAI's AI models achieved gold medal performance at the International Collegiate Programming Contest (ICPC), sparking debate about the significance and implications of this achievement.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 14m after posting
Peak period: 153 comments (Day 1)
Avg / period: 32
Comment distribution: 160 data points (based on 160 loaded comments)
Key moments
1. Story posted: Sep 17, 2025 at 2:15 PM EDT (4 months ago)
2. First comment: Sep 17, 2025 at 2:29 PM EDT (14m after posting)
3. Peak activity: 153 comments in Day 1 (hottest window of the conversation)
4. Latest activity: Sep 29, 2025 at 11:03 AM EDT (3 months ago)
ID: 45279357 · Type: story · Last synced: 11/20/2025, 8:28:07 PM
Copying from a comment I made a few weeks ago:
> I dunno, I can see an argument that something like IMO word problems are categorically a different language space than a corpus of historiography. For one, even when expressed in English language, math is still highly, highly structured. Definitions of terms are totally unambiguous, logical tautologies can be expressed using only a few tokens, etc. etc. It's incredibly impressive that these rich structures can be learned by such a flexible model class, but it definitely seems closer (to me) to excelling at chess or another structured game, versus something as ambiguous as synthesis of historical narratives.
edit: oh small world! the cited comment was actually a response to you in that other thread :D
That's hilarious, we must have the same interests since we keep cross posting :D
The thing with the Go comparison is that AlphaGo was meant to solve Go and nothing else. It couldn't do chess with the same weights.
The current SotA LLMs are "unreasonably good" at a LOT of tasks, while being trained with a very "simple" objective: next-token prediction (NTP). That's the key difference here. We have these "stochastic parrots" + RL + compute that basically solve top-tier competitions in math, coding, and who knows what else... I think it's insanely good for what it is.
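(For readers who don't know the acronym, here is a minimal sketch of the NTP objective, assuming a PyTorch-style `(batch, seq_len, vocab)` logits tensor; illustrative only, not any particular lab's training code:)

```python
import torch
import torch.nn.functional as F

def ntp_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    # Next-token prediction: cross-entropy between the model's logits at
    # position t and the actual token at position t+1.
    pred = logits[:, :-1, :]   # predictions for every position except the last
    target = tokens[:, 1:]     # the "next" token each position should predict
    return F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))
```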
Oh totally! I think that the progress made in NLP, as well as the surprising collision of NLP with seemingly unrelated spaces (like ICPC word problems), is nothing short of revolutionary. Nevertheless I also see stuff like this: https://dynomight.substack.com/p/chess
To me this suggests that this out-of-domain performance is more like an unexpected boon than a guarantee of future performance. The "and who knows what else..." is kind of what I'm getting at: so far we are turning out to be bad at predicting where these tools will excel or fall short. To me this is sort of where the "wall" stuff comes from; despite all the incredible successes in these structured problem domains, nobody (in my personal opinion) has really unlocked the "killer app" yet. My belief is that by accepting their limitations we might better position ourselves to laser-target LLMs at the kinds of things they rule at, rather than trying to make them "everything tools".
Indeed, it seems that in most language-model RL there is not even process supervision, so it's a long way from NTP.
If GPT-5, as claimed, is able to solve all problems in ICPC, please give the instructions on how I can reproduce it.
I will say that after checking, I see that the model is set to "Auto", and as mentioned, used almost 8 minutes. The prompt I used was:
It did a lot of thinking, and I can see that it visited 13 webpages, including icpc, codeforces, geeksforgeeks, github, tehrantimes, arxiv, facebook, stackoverflow, etc. I don't know what DeepMind and OpenAI did in this case, but to get an idea of the kind of scaffolding and prompting strategy one might want, have a look at this paper, where some folks used the normal, generally available Gemini 2.5 Pro to solve 5/6 of the 2025 IMO problems: https://arxiv.org/pdf/2507.15855
Call it the “shoelace fallacy”: Alice is supposedly much smarter but Bob can tie his shoelaces just as well.
The choice of eval, prompt scaffolding, etc. all dramatically impact the intelligence that these models exhibit. If you need a PhD to coax PhD performance from these systems, you can see why the non-expert reaction is “LLMs are dumb” / progress has stalled.
I think the contradiction here can be reconciled by noting that these tests don't tend to run under the typical hardware constraints needed to do this at scale. And herein lies a large part of the problem as far as I can tell: in late 2024, OpenAI realized they had to rethink GPT-5 since their first attempt became too costly to run. This delayed the model, and when it finally released, it was not a revolutionary update but evolutionary at best compared to o3. Benchmarks published by OpenAI themselves indicated a 10% gain over o3, for God knows how much cash and well over a year of work. We certainly didn't have those problems in 2023 or even 2024.
DeepSeek has had to delay R2, and Mistral has had to delay Mistral 3 Large, teased as weeks away back in May. No word from either about what's going on. DeepSeek is said to be moving more to Huawei hardware, and this is supposedly behind the delay, but I don't think it's entirely clear that it has nothing to do with performance issues.
It would be more strange to _not_ have people speculate about stagnation or bubbles given these events and public statements.
Personally, I'm not sure if stagnation is the right word. We're seeing a lot of innovation in the toolsets and platforms surrounding LLMs, like Codex, Claude Code, etc. I think we'll see more in this regard, and that this will provide more value in 2026 than the core improvements to the LLMs themselves.
And as for the bubble, I think we are in one, but mostly because the market has been so incredibly hot. I see a bubble not because AI will fall apart but because there are too many products and services right now in a gold-rush era. Companies will fail, not because AI suddenly starts failing us but due to saturation.
It is a revolutionary update if compared to the previous major release (GPT-4 from March 2023).
If you look at the details of how Google got gold at IMO, you'll see that AlphaGeometry only relies on LLMs for a very specific part of the whole system, and the LLM wasn't the core problem solving system in play.
Most of AlphaGeometry is standard algorithms at play solving geometry problems using known constraints. When the algorithmic system gets stuck, it reaches out to LLMs that were fine tuned specifically for creating new geometric constraints. So the LLM would create new geometric constraints and pass that back to the algorithmic parts to get it unstuck, and repeat.
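A schematic of that loop, with hypothetical method names standing in for the components described in the blog post (this is a reading of the announcement, not DeepMind's actual code):

```python
# Neuro-symbolic loop: the symbolic engine does the deduction; a fine-tuned
# LM is consulted only to propose new auxiliary constructions when stuck.
def solve(problem, symbolic_engine, construction_lm, max_rounds=16):
    state = symbolic_engine.initialize(problem)
    for _ in range(max_rounds):
        state = symbolic_engine.deduce(state)  # standard algorithmic search
        if symbolic_engine.proves_goal(state):
            return state.proof()
        # Stuck: ask the LM for a candidate construction, hand it back
        # to the symbolic side, and try deducing again.
        construction = construction_lm.propose(state)
        state = symbolic_engine.add_construction(state, construction)
    return None
```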
Without more details, it's not clear if this win also used the GPT-5 and Gemini models we use, or specially fine-tuned models integrated with other non-LLM and non-ML systems to solve these.
Not being solved purely by LLM isn't a knock on it, but with the current conversations going on today with LLMs, these are heavily being marketed as "LLMs did this all by themselves", which doesn't match with a lot of the evidence I've personally seen.
[1]https://deepmind.google/discover/blog/advanced-version-of-ge...
I personally view all this stuff as noise. I'm more interested in seeing any contributions to the real economy, not some competition stuff that is irrelevant to the welfare of people.
That seems...highly implausible?
Example: during parking, which I witness daily in my building, it happens all the time.
1. Car gets stuck trying to park, blocking either the garage or a whole SF street.
2. A human intervenes, either in person (most often) or seemingly remotely, to get the car unstuck.
Can you explain how a human intervenes in person?
Do you mean these cars have a human driver on board? Or the passenger drives? Or another car drops off a driver? Or your car park is such an annoying edge case that a driver hangs around there all the time just to help park the cars?
The paperclip trivial solution!
I had a class of 5 or so test methods - ABCDE. I asked it to fix C, so it started typing out B token-by-token underneath C, such that my source file was now ABCBDE.
I don't think I'm smart enough to get it to do coding activities.
I don't know if we're in a bubble for model capabilities, but we are definitely hitting the wall in terms of what the rest of the physical economy can provide.
You can't undo 50 years of deferred maintenance in three months.
What happens when OpenAI and friends go bust because China is drowning in spare grid capacity and releasing sota open weights models like R1 every other week?
Every company building infrastructure for AI also goes out of business, and we end up in a worse position than we are in now: instead of having a tiny industry building infrastructure at the level required to replace what has reached end of life, we have nothing.
In 2016 Geoffrey Hinton said vision models would put radiologists out of business within 5-10 years. Ten years on, there is a shortage of radiologists in the US and AI hasn't disrupted the industry.
The DARPA Grand Challenge for autonomous vehicles was won in 2006; 20 years on, self-driving cars still have limited deployment.
The real world is more complex than computer scientists appreciate.
Also, I think people do understand just how big a deal AI is, but don't want to accept it, or at least publicly admit it, because they are scared for a number of reasons, not least of which is human irrelevance.
This is a narrow niche with a large amount of training data (they all buy training data from LeetCode), and these results are not necessarily generalizable to everyday industrial tasks.
The sibling commenter compared this to Go, but we could go back to comparing it with chess. Deep Blue didn't play chess the way a human did. It deployed massive amounts of compute to look at as many future board states as possible, in order to see which move would work out. People who said that a computer that could play chess as well as a human would be as smart as a human ended up eating crow. These modern AIs are also not playing these competitions the way a human does. Comparing their intelligence to a human's is similarly fallacious.
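The brute-force lookahead described here is essentially minimax. A generic sketch (the `moves`/`apply_move`/`evaluate` callbacks are placeholders; Deep Blue layered alpha-beta pruning, evaluation heuristics, and custom hardware on top of this idea):

```python
def minimax(state, depth, maximizing, moves, apply_move, evaluate):
    # Exhaustively score `state` by exploring every move sequence
    # up to `depth` plies ahead.
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state)
    scores = [minimax(apply_move(state, m), depth - 1, not maximizing,
                      moves, apply_move, evaluate) for m in legal]
    return max(scores) if maximizing else min(scores)
```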
I think this is huge news, and I cannot imagine anything other than models with this capability having a massive impact all over the world. It causes me to be more worried than excited; it is very hard to tell what this will lead to, which is probably what makes it scary for me.
However with so little transparency from these companies and extreme financial pressure to perform well in these contests, I have to be quite sceptical of how truthful these results are. If true I think it is really remarkable, but I really want some more solid proof before I change my worldview.
This is helpful in framing the conversation, especially with "skeptics" of what these models are capable of.
Without any of this I can't even know for sure if there was any human intervention. I don't really think so, but as I mentioned the financial pressure to perform well is extreme so I can totally see that happening. Maybe ICPC did have some oversight, but please write a bit about it then.
If you assume no human intervention, then all of this is of course irrelevant if you only care about the capabilities that exist. But the implications of a general model performing at this level vs. something more like a chess model trained specifically on competitive programming are of course different, even if the gap may close in the future. And how much compute/power was used: are we talking hundreds of kWh? Does that just mean larger models than normal, or intelligent brute-forcing through a huge solution space? If the latter, it is not clear how much they will be able to scale down the compute usage while keeping performance at the same level.
It thought for 7m 53s and gave as its reply:
-
You are a gold level math olympiad competitor participating in the ICPC 2025 Baku competition. You will be given a competitive programming problem to solve completely.
All problems are located at the following URL: https://worldfinals.icpc.global/problems/2025/finals/problem...
Here is the problem you need to solve and only solve this problem:
<problem> Problem B located on Page 3 of the PDF that starts with this text - but has other text so ensure you go to the PDF and look at all of page 3
To help her elementary school students understand the concept of prime factorization, Aisha has invented a game for them to play on the blackboard. The rules of the game are as follows.
The game is played by two players who alternate their moves. Initially, the integers from 1 to n are written on the blackboard. To start, the first player may choose any even number and circle it. On every subsequent move, the current player must choose a number that is either the circled number multiplied by some prime, or the circled number divided by some prime. That player then erases the circled number and circles the newly chosen number. When a player is unable to make a move, that player loses the game.
To help Aisha’s students, write a program that, given the integer n, decides whether it is better to move first or second, and if it is better to move first, figures out a winning first move.</problem>
Your task is to provide a complete solution that includes: 1. A thorough analysis and solution approach 2. Working code implementation 3. Unit test cases with random inputs 4. Performance optimization to run within 1 second
Use your scratchpad to think through the problem systematically before providing your final solution.
<scratchpad> Think through the following steps:
1. Problem Understanding: - What exactly is the problem asking for? - What are the input constraints and output requirements? - Are there any edge cases to consider?
2. Solution Strategy: - What algorithm or mathematical approach should be used? - What is the time complexity of your approach? - What is the space complexity? - Will this approach work within the given constraints?
3. Implementation Planning: - What data structures will you need? - How will you handle input/output? - What are the key functions or components?
4. Testing Strategy: - What types of test cases should you create? - How will you generate random inputs within the problem constraints? - What edge cases need specific testing?
5. Optimization Considerations: - Are there any bottlenecks in your initial approach? - Can you reduce time or space complexity? - Are there language-specific optimizations to apply? </scratchpad>
Now provide your complete solution with the following components:
<analysis> Provide a detailed analysis of the problem, including: - Problem interpretation and requirements - Chosen algorithm/approach and why - Time and space complexity analysis - Key insights or mathematical observations </analysis>
<solution> Provide your complete, working code solution. Make sure it: - Handles all input/output correctly - Implements your chosen algorithm efficiently - Includes proper error handling if needed - Is well-commented for clarity </solution>
<unit_tests> Create comprehensive unit test cases that: - Test normal cases with random inputs within constraints - Test edge cases (minimum/maximum values, boundary conditions) - Include at least 5-10 different test scenarios - Show expected outputs for each test case </unit_tests>
<optimization> Explain any optimizations you made or could make: - Performance improvements implemented - Memory usage optimizations - Language-specific optimizations - Verification that solution runs within 1 second for maximum constraints </optimization>
Take all the time you need to solve this problem thoroughly and correctly.
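For what it's worth, the Problem B game quoted in this prompt can be brute-forced for small n to sanity-check a solution's first moves. A minimal sketch (exponential search over erased sets, nowhere near contest constraints, and not the intended contest algorithm):

```python
from functools import lru_cache

def solve(n):
    # x and y are adjacent iff one is the other multiplied by a prime.
    primes = [p for p in range(2, n + 1)
              if all(p % d for d in range(2, int(p ** 0.5) + 1))]
    adj = {x: set() for x in range(1, n + 1)}
    for x in adj:
        for p in primes:
            if x * p <= n:
                adj[x].add(x * p)
                adj[x * p].add(x)

    @lru_cache(maxsize=None)
    def to_move_wins(cur, erased):
        # The player about to move from `cur` wins iff some legal move
        # leaves the opponent in a losing position.
        return any(not to_move_wins(nxt, erased | frozenset([cur]))
                   for nxt in adj[cur] - erased)

    # Circling an even number is player 1's first move; player 2 moves next.
    for v in range(2, n + 1, 2):
        if not to_move_wins(v, frozenset()):
            return f"first player wins by circling {v}"
    return "second player wins"

for n in range(2, 13):
    print(n, solve(n))
```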
https://www.acmicpc.net/problem/33797
I have the $20 plan and I think I found a weird bug, at least with the thinking version. It gets stuck in the same local minimum super quickly, even though the "fake solution" is easily disproved on random tests.
It's at the point where sometimes I've fed it the editorial and it still converges to the fake solution.
https://chatgpt.com/share/68c8b2ef-c68c-8004-8006-595501929f...
I'm sure the model is capable of solving it, but seriously, I've tried across multiple generations (since about when o3 came out) to get GPT to solve this problem, and I don't think it's hampered by innate ability; it literally just refuses to think critically about the problem. Maybe with better prompting it doesn't get stuck as hard?
EDIT: Just submitted it, WA. Yeah.
Problems are hard enough where consumer models can't solve all 12 problems.
The fact is most ordinary mortals never get access to a fraction of that kind of power, which explains the commonly reported issues with AI models failing to complete even rudimentary tasks. It's now turned into a whole marketing circus (maybe to justify these ludicrous billion-dollar valuations?).
“The fact is most ordinary mortals never get access to a fraction of that kind of power”
A certain amount of input and output tokens doesn't cost 10x less than before.
(I’m a former ICPC competitor)
I believe it was Sundar in an interview with Lex who said that the reason they haven't developed another Ultra model is because by the time it is ready to launch, the flash and pro versions will have already made it redundant.
Yes, there's an entire ecosystem built up around language models that has to stay afloat for at least another 5 years to hope for a significant breakthrough.
> our OpenAI reasoning system got a perfect score of 12/12
> For 11 of the 12 problems, the system’s first answer was correct. For the hardest problem, it succeeded on the 9th submission. Notably, the best human team achieved 11/12.
> We had both GPT-5 and an experimental reasoning model generating solutions, and the experimental reasoning model selecting which solutions to submit. GPT-5 answered 11 correctly, and the last (and most difficult problem) was solved by the experimental reasoning model.
I'm assuming that "GPT-5" here is a version with the same model weights but higher compute limits than even GPT-5 Pro, with many instances working in parallel, and some specific scaffolding and prompts. Still, extremely impressive to outperform the best human team. The stat I'd really like to see is how much money it would cost to get this result using their API (with a realistic cost for the "experimental reasoning model").
Hopefully that prompt was the same for all questions (I think that is what they did for the IMO submission, or maybe it was Google that did that, not sure).
What's the judgement here? Was it within the allotted time, or just a "try as often as you need to"?
But submitting a non-working solution gives you a time penalty (usually 20 mins). Yet this time penalty only applies if in the end, you actually solve the problem. So it never hurts to try.
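A toy version of that accounting (the standard rule: teams are ranked by problems solved, ties broken by total penalty time; each rejected submission costs 20 minutes, but only on problems eventually solved):

```python
# Penalty for one problem: minutes until the accepted submission plus
# 20 minutes per earlier rejected try; unsolved problems contribute nothing.
def penalty(minutes_to_accept, wrong_tries, solved):
    return minutes_to_accept + 20 * wrong_tries if solved else 0

print(penalty(137, 2, True))   # 177: solved at minute 137 after two wrong tries
print(penalty(0, 5, False))    # 0: five failed tries on an unsolved problem
```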
Nonetheless, I'm still questioning what the cost is and how long it will take for us to be able to access these models.
Still great work, but it's less useful if the cost is actually higher than hiring someone at the same level.
How do you compare those?
There were at least 2 very simple problems in IOI this year.
I haven't read the ICPC problem set, and perhaps there are some low-hanging fruits, but I highly doubt it.
More evidence: you only have 5 hours to solve 3 problems at IOI, but you need to solve 10+ problems at ICPC. It's impossible for all 10+ ICPC problems to be at IOI level.
Isn't getting a medal a function of your ranking, not score, in both cases? If so, that does not prove much about the difficulty of either.
Yes, a medal is a function of ranking, not difficulty.
Nonetheless, I would say that IOI focuses more on thinking, which to some degree I am not that good at, while ICPC is more of a mix of thinking and implementing. Therefore, my ability to implement stuff can improve my ICPC ranking but not my IOI ranking.
Sure, you need to be individually good at thinking, etc. But the difference between 1st and places further down the ranking is teamwork.
(In a certain sense, this is actually the ideal "teamwork" setup in industry as well: a bunch of people who own their part, are trusted by their colleagues to take care of it, and don't step on each other's toes, rather than a kumbaya let's-all-get-together-on-the-same-problem approach.)
We were "just" three friends who had studied together for 4 years, knew each other's strengths and weaknesses intimately, and then for the comp trained intensively on optimising the "parallelization/scheduling" aspects (as you put it) to get the best score in the minimum time. That included both the logistical and mental aspects of recovering from setbacks midway through the 5 hour problem sets.
During the finals, you'd be surprised how many teams' teamwork we saw fall apart when three very smart people under intense time pressure hit unexpectedly failing submissions with the bottleneck of a single computer. ICPC is a genius format.
I'd go as far as saying that gold at the IOI is probably easier than getting an ICPC medal. (One is individual and the other is in teams, but my point stands).
Totally agree that IOI bronze is way easier than ICPC bronze. In terms of rank/ratio, ICPC medals are more like IOI gold.
I stated it that way because I thought it would make the difficulty difference a bit easier for people to grasp. (Agreed it's weird, though.)
Doesn't say anything about the difficulty of the questions themselves though.
For fractional scores, it depends on the problem. In short, there are two types of problems at IOI. One is traditional problems that require 100% correctness; the other uses continuous scoring.
The former can still result in a score between 0 and 100, because there are subtasks within the problem: for example, the graph becomes a tree or even just a linear sequence. Nonetheless, you still need to ensure your algorithm is correct on all test cases in that subtask in order to get that subtask's score.
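A toy version of that all-or-nothing subtask scoring (illustrative only):

```python
# Each subtask's points are awarded only if every test in it passes.
def ioi_subtask_score(subtasks):  # [(points, [test_passed, ...]), ...]
    return sum(points for points, tests in subtasks if all(tests))

# 20-point and 50-point subtasks fully pass; the 30-point one has a failure.
print(ioi_subtask_score([(20, [True, True]),
                         (30, [True, False]),
                         (50, [True] * 5)]))  # 70
```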
Essentially, we need to poison AI in all possible ways, without impacting human reading. They either have to hire more humans to filter the information, or hire more humans to improve the crawlers.
Or we can simply stop sharing knowledge. I'm fine with it, TBF.
> AI companies are not paying anyone for that piece of information
So? For the vast majority of human existence, paying for content was not a thing, just like paying for air isn't. The copyright model you are used to may just be too forced. Many countries have no moral qualms about "pirating" Windows and other pieces of software or games (which they couldn't afford to purchase anyway). There's no inherent morality or entitlement to an author receiving payment for everything they "create" (to wit, Bill Gates had to write a letter to the Homebrew Computer Club to make a case for this, showing that it was hardly the default and natural viewpoint). It's just a legal/social contract to achieve specific goals for the society. Frankly, the wheels of copyright have been falling off since the dawn of the Internet, not LLMs.
For the majority of interesting output people have paid for art, music, software, journalism. But you know that already and are justifying the industry that pays your bills.
Why would anyone think that these companies will contribute to the good of humanity when they are even bigger and more powerful, when they seem to care so little now?
Have you seen the people who do OpenAI demos? It becomes pretty apparent upon inspection, what is driving said people.
Irrelevant, really. Invoking this in the argument shows the basis is jealousy. They are clearly valued as such not because they collected all the data and stored it in some database. Your local library is not worth $300 billion.
> For the majority of interesting output people have paid for art, music, software, journalism
Absolutely and demonstrably false. Music and art predate Copyright by hundreds if not thousands of years.
> But you know that already and are justifying the industry that pays your bills.
Huh, ad hominem much? I find it rich that the whole premise of your argument was some "art, music, software, journalist" was entitled to some payment, but suddenly it is a problem when "my industry" (somehow you assume I work in AI) is getting paid?
And as I said, art was always paid for. In the case of monarchies, at least their advisers usually had good taste, unlike rich people today.
I have received free advice from such actual plumbers (and mechanics, and others for that matter) that reduced my future need for their services.
> we should really stop giving free advice and training to these robots
People routinely freely give advice and teach students, friends, potential competitors, actual competitors, etc on this same forum. Robots? Many also advocate for immigration and outsourcing, presumably because they make the calculus that it is net beneficial in some scenarios. People on this forum contribute to an entire ecosystem of free software, on top of which two kids can and have built $100 billion companies that utilize all such technology freely and without cost. Let's ban it all?
Sure, I totally get it if you want to make an individual choice for yourself to keep a secret sauce, not share your code, or put stuff behind a paywall. That is not the tone and the message here. There is some deep animosity advocating for everyone shutting down their pipes to AI as if it were some malevolent thing, similar to how Ted Kaczynski saw technology at large.
but the companies operating it certainly are
they have no concept of consent
they take anything and everything, regardless of copyright or license, with no compensation to the authors
and then use it to directly compete with those they ripped off
not to mention shoving their poor quality generated slop everywhere they can possibly manage, regardless of ethics, consent or potential consequences
children should not be supplied a sycophantic source of partial truths that has been instructed to pretend to be their friend
this is text book malevolence
Which ones in particular? Is your belief that all companies are inherently malevolent? If not, why don't you start one that is not? What's stopping you?
Because the one I start will be beaten by the one that is malevolent if they have a weapon that is as powerful as AI. All these arguments about "we shared stuff before so what's the problem?" are missing the point. The point is that this is about the concentration of power. The old sharing was about distribution of power.
> What's stopping you?
from doing what?
I don't want shitty AI slop; why would I start a company intent on generating it?
Don't waste the mental energy. They're more interested in performative ignorance and argument than anything productive. It's somewhere between trying to engage Luddites during the industrial revolution and having a reasonable discussion with /pol/ .
They'd rather cling to what they know than embrace change, or get in rhetorical zingers, and nothing will change that except a collision with reality.
There's the AI industry, which you engaged with, which is more or less a flailing attempt to capitalize on the new technology, and which yields some results but has seen quite a staggering number of flops.
There's also the AI technology - progress in AI is on an exponential trend, tied to Moore's law, and has trillions of dollars of impetus in play, nearly completely decoupled from the market in general - I think we'll see at least a decade of progress increasingly accelerating, with massive world models and language models built on current architectures, but from a technical point of view, I believe we're only a couple breakthroughs from getting a truly general architecture.
The worst case scenario for AI is having to wait on sensor technologies and scans of the human brain. At some point, we will have a good enough, explicable, and analyzed model of human neural function and connectomes to create AI models that operate in the same way that the brain processes information.
We're probably 20 years or less from that point. I say this because nearly all brain tissue is generalized: you don't have one type of mechanism for sight, another for thinking, another for feeling happy, another for remembering things; it all runs on the same basic substrate. Every time we map out a cubic millimeter of tissue, we're making progress towards understanding the algorithms by which we experience and process the world.
On the software side, though, I suspect we're within a few years: one person with a profound insight will be able to make the leap between whatever it is that humans do and the way in which some AI model is processing, and put that insight into algorithmic form. There might be multiple insights along the path, but it is undeniable that progress is happening, and that the rate of progress is increasing day by day. AI just might already be capable enough to make that last little leap without human intervention.
We're in brute-force territory, with massive ChatGPT and Grok models requiring billions of dollars of infrastructure and systems in place to achieve their results.
In 20 years, stuff like that will be achievable by an ambitious high school computer lab, or a rich nerd building things for kicks.
You can effectively put all of the text of the internet onto a dozen 2TB microSD cards. Throw in the pirate data sources, like scihub, pirated books, all the text out there, and maybe it'll take 20 of those microSD cards. $5k or less and you can store and own it all.
A phone in 2045 will have compute, throughput, and storage comparable with a state of the art GPU and server today, and we're likely to optimize in the direction of AI hardware between now and then.
The current AI startup bubble is going to collapse, no doubt, because LLMs aren't the right tool for many jobs, and frontier models keep eating the edge cases and niches people are trying to exploit. We'll see it settle into a handful of big labs, but those big labs will continue chugging along.
I'm not betting on stagnation, though, and truly believe we're going to see a handful of breakthroughs that will radically improve the capabilities of AI in the near term.
Books were bought and teachers were paid, so no, for most of human history information was not free.
It is an idiotic benchmark, in line with the rest of the "AI" propaganda.
It is like an open-book exam for humans where they can also look up similar problems.
The current top comment makes the same point, but in a more diplomatic and sophisticated manner.
Or due to its massive internal database combined with superhuman button pressing skills?
This means that you have to be smart about who is going to spend time coding, thinking, or debugging. The time pressure is intense, and it really is a team sport.
It's also extra fun if one of the team members prefers a Dvorak keyboard layout and vi, and the others do not.
I wonder how three different AI vendors would cooperate. It would probably lift reinforcement learning to the next level.
I'm not sure how it would play out, but at least when you let them talk to each other they tend to get very technical very fast.
Apparently Gemini solved one problem (running on who knows what kind of cluster) by burning 30 min of "thinking" time on it, and at a cost that Google have declined to provide.
According to one prior competition participant, writing in the comments section of this Ars Technica coverage, each year they include one "time sink" problem that smart humans will avoid until they have tackled everything else.
https://arstechnica.com/google/2025/09/google-gemini-earns-g...
This would all seem to put a rather different spin on things. It's not a case of Google outwitting the world's best programmers, but rather that by searching for solutions for 30 minutes on God knows what kind of cloud hardware, they were able to get something done that the college kids did not have time to complete, or deem worthwhile to start.
https://icpc.global/regionals/rules
I don't know what you mean by "elite", and there are certainly plenty of teams at the World Finals that are not especially competitive, and there are certainly many elite programmers who don't qualify for various reasons (most obviously by being the wrong age, not at the right stage of school, or having already attended too many times), but I find it hard to believe that there aren't enough "elite" programmers present to make the winning teams genuinely elite.
Compare to, say, the Olympics or pretty much any academic olympiad. There are many people and teams at the Olympics who are not remotely competitive with the winners.
And yet, they are so much closer to the winners than the people that came 11th, 12th etc.
Every year there are multiple "Legendary Grandmasters" in the competition. That's >3000 Elo in Codeforces. I'd estimate it takes a similar level of skill/effort as becoming a Chess Grandmaster.
And even those that aren't at that level are very competent at it. The average ICPC participant is likely "smarter" than the average MIT/Harvard CS student for some reasonable measure of "smarter".
ICPC finalists are very much in the world elite of competitive programmers.
note: my team only passed the first 2 rounds, far from bragging about my skills here :)
First, this is really impressive.
Second, with that out of the way: these models are not playing the same game as the human contestants, in at least two major regards. First, and quite obviously, they have massive amounts of compute power, which is kind of like giving a human team a week instead of five hours. Second, the models that are competing have absolutely massive memorization capacity, whereas the teams are allowed to bring a 25-page PDF with them and need to manually transcribe anything from that PDF that they actually want to use in a submission.
I think that, if you gave me the ability to search the pre-contest Internet and a week to prepare my submissions, I would be kind of embarrassed if I didn't get gold, and I'd find the contest to be rather less interesting than I would find the real thing.
I don't know what your personal experience with competitive programming is, so your statement may be true for yourself, but I can confidently state that this is not true for the VAST majority of programmers and software engineers.
Much like trying to do IMO problems without tons of training/practice, the mid-to-hard problems in the ICPC are completely unapproachable to the average computer science student (who already has a better chance than the average software engineer) in the course of a week.
In the same way that LLMs have memorized tons of stuff, the top competitors capable of achieving a gold medal at the ICPC know algorithms, data structures, and how to pattern match them to problems to an extreme degree.
That may well be true. I think it's even more true in cases where the user is not a programmer by profession. I once watched someone present their graduate-level research in a different field and explain how they had solved a real-world problem in their field by writing a complicated computer program full of complicated heuristics to get it to run fast enough and thinking "hmm, I'm pretty sure that a standard algorithm from computer graphics could be adapted to directly solve your problem in O(n log n) time".
If users can get usable algorithms that approximately match the state of the art out of a chatbot (or a fancy "agent") without needing to know the magic words, then that would be amazing, regardless of whether those chatbots/agents ever become creative enough to actually advance the state of the art.
(I sometimes dream of an AI producing a piece of actual code that comes even close to state of the art for solving mixed-integer optimization problems. That's a whole field of wonderful computer science / math that is mostly usable via a couple of extraordinarily expensive closed-source offerings.)
Take a look at Google OR-Tools: https://developers.google.com/optimization/
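For anyone curious, a minimal mixed-integer example using OR-Tools' SCIP backend (this mirrors the basic MIP example in the OR-Tools docs; `pip install ortools`):

```python
from ortools.linear_solver import pywraplp

# Maximize x + 10*y subject to x + 7*y <= 17.5 and x <= 3.5,
# with x and y non-negative integers.
solver = pywraplp.Solver.CreateSolver("SCIP")
x = solver.IntVar(0, solver.infinity(), "x")
y = solver.IntVar(0, solver.infinity(), "y")
solver.Add(x + 7 * y <= 17.5)
solver.Add(x <= 3.5)
solver.Maximize(x + 10 * y)

if solver.Solve() == pywraplp.Solver.OPTIMAL:
    print("objective =", solver.Objective().Value())   # 23 (x=3, y=2)
    print("x =", x.solution_value(), "y =", y.solution_value())
```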
Compute and such is a fair point but that AI is here at all is mind-blowing to me.
The fact that they don't disclose the cost is a clue that it's probably outrageous today. But costs are coming down fast. And hiring a team of these guys isn't exactly cheap either.
Google source post: https://deepmind.google/discover/blog/gemini-achieves-gold-l... (https://news.ycombinator.com/item?id=45278480)
OpenAI tweet: https://x.com/OpenAI/status/1968368133024231902 (https://news.ycombinator.com/item?id=45279514)
88 more comments available on Hacker News