Launch HN: Hypercubic (YC F25) – AI for COBOL and Mainframes
I would expect most of these systems to come with very carefully guarded access controls. It also strikes me as a uniquely difficult challenge to track down the decision-maker who is willing to take the risk on revamping these systems (AI or not). Curious to hear more about what you’ve learned here.
Also curious to hear how LLMs perform on a language like COBOL that likely doesn’t have many quality samples in the training data.
The decision makers we work with are typically modernization leaders and mainframe owners — usually director or VP level and above. There are a few major tailwinds helping us get into these enterprises:
1. The SMEs who understand these systems are retiring, so every year that passes makes the systems more opaque.
2. There’s intense top-down pressure across Fortune 500s to adopt AI initiatives.
3. Many of these companies are paying IBM 7–9 figures annually just to keep their mainframes running.
Modernization has always been a priority, but the perceived risk was enormous. With today’s LLMs, we’re finally able to reduce that risk in a meaningful way and make modernization feasible at scale.
You’re absolutely right about COBOL’s limited presence in training data compared to languages like Java or Python. Given that COBOL is highly structured and readable, current reasoning models get us to an acceptable level of performance where it's now valuable to use them for these tasks. For near-perfect accuracy (95%+), we see a large opportunity to build domain-specific frontier models purpose-built for these legacy systems.
That’s exactly the opportunity we have in front of us, and we aim to make it possible through our own frontier models and infra.
Here, that person is a manager who got demoted from ~500 reports to ~40 and then convinced his new boss that it would be good to reuse his team for his personal AI strategy, which will make him great again.
Using AI and the few different modalities of information that exist about these systems (existing code, docs, AI-driven interviews, and workflow capture), we can triangulate and extract that tribal knowledge.
2. You do a "line by line" reimplementation in Java (well, banks like it).
3. You run the test suite and track your progress.
4. When you get to 100 percent, you send the same traffic to both systems and shadow-run the new implementation. Depending on how that goes, you either give up, go back to rework the implementation, or finally switch over to the new system.
This is obviously super expensive and slow, in order to minimize any sort of risk for systems that usually handle billions or trillions of dollars.
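To make the shadow-run step concrete, here is a minimal sketch of what that comparison harness might look like, assuming a simple HTTP front end; the endpoint URLs and compared fields are hypothetical placeholders, not anything from this thread:

```python
# Minimal shadow-run sketch: send the same request to the legacy system and
# the new reimplementation, compare responses, and log any divergence.
# Endpoint URLs and the compared fields are hypothetical placeholders.
import logging

import requests

LEGACY_URL = "https://legacy.internal/api/transaction"   # hypothetical
CANDIDATE_URL = "https://new.internal/api/transaction"    # hypothetical
COMPARED_FIELDS = ["balance", "status", "fee"]            # hypothetical

log = logging.getLogger("shadow-run")

def shadow_call(payload: dict) -> dict:
    """Serve traffic from the legacy system while mirroring it to the new one."""
    legacy_resp = requests.post(LEGACY_URL, json=payload, timeout=5).json()
    try:
        candidate_resp = requests.post(CANDIDATE_URL, json=payload, timeout=5).json()
    except Exception:
        log.exception("candidate failed for payload %s", payload)
        return legacy_resp  # the legacy answer is still the source of truth

    diffs = {
        f: (legacy_resp.get(f), candidate_resp.get(f))
        for f in COMPARED_FIELDS
        if legacy_resp.get(f) != candidate_resp.get(f)
    }
    if diffs:
        log.warning("divergence on %s: %s", payload.get("id"), diffs)
    return legacy_resp  # callers only ever see the legacy result
```

Only after the divergence log stays empty for long enough would anyone seriously consider cutting over.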
Our focus is different: we’re using AI to understand these 40+ year-old black box systems and capture the knowledge of the SMEs who built and maintain them before they retire. There simply aren’t enough engineers left who can fully understand or maintain these systems, let alone modernize them.
The COBOL talent shortage has already been a challenge for many decades now, and it’s only becoming more severe.
There's a bunch of mainly legacy hospital and government (primarily VA) systems that run on it. And where there's big government systems, there's big government dollars.
Later I got into programming language theory, and took another look at MUMPS from that perspective. As a programming language, it’s truly terrible in ways that languages like COBOL and FORTRAN are not. Just as one example, “local” variables have an indefinite lifetime and are accessible throughout a process, i.e. they’re not scoped to functions. But you can dynamically hide/shadow and delete them. It would be hard to design a less tractable way of managing variables if you tried.
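For readers who have never touched MUMPS, here is a rough, purely illustrative Python analogue of that behaviour (not MUMPS syntax; every name here is made up): one process-wide symbol table that all routines share, with NEW-style shadowing and KILL-style deletion bolted on.

```python
# Rough Python analogue of MUMPS-style "locals": one process-wide symbol table
# that every routine reads and writes, with NEW-style shadowing and KILL-style
# deletion. Purely illustrative; names are made up.
from contextlib import contextmanager

SYMTAB = {}  # all "locals" live here for the lifetime of the process

def set_var(name, value):
    SYMTAB[name] = value

def get_var(name):
    return SYMTAB.get(name)

def kill(name):
    SYMTAB.pop(name, None)  # KILL: the variable is simply gone, for everyone

@contextmanager
def new(name):
    """NEW: shadow a variable for the duration of a call, then restore it."""
    had_it, old = name in SYMTAB, SYMTAB.get(name)
    SYMTAB.pop(name, None)
    try:
        yield
    finally:
        if had_it:
            SYMTAB[name] = old
        else:
            SYMTAB.pop(name, None)

def subroutine():
    # No parameters, no declarations: just reach into the shared table.
    set_var("X", get_var("X") + 1)

set_var("X", 41)
subroutine()
print(get_var("X"))        # 42 -- the "local" outlived the call that set it
with new("X"):
    set_var("X", 0)        # shadows the caller's X inside this scope only
print(get_var("X"))        # 42 again -- the shadow is gone
```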
MUMPS’ value proposition was how it handled persistent data as a built-in part of the language. In that sense it was a precursor to systems like dBASE, which were eventually supplanted by SQL databases. MUMPS was a pretty good persistent data management system coupled with a truly terrible programming language.
I was curious to ask you, as domain experts, if you could talk more to the "70% of the Fortune 500 still run on mainframes" stat you mentioned.
Where do these numbers come from? And also, does it mean that those 70% of Fortune 500s literally run/maintain 1k-1M+ LoC of COBOL? Or do these companies depend on a few downstream specialized providers (financial, aviation/logistics, etc.) which do rely on COBOL?
Like, is it COBOL all the way down, or is everything built in different ways, but basically on top of 3 companies, and those 3 companies are mostly doing COBOL?
Thanks!
Generalizing: if the company had enough need for it 30 years ago, was big enough to buy a mainframe, and the thing they used it for barely changed, chances are it’s still there, if the company is still there.
Banks absolutely do have it in house, in a dedicated secure site with a fence and a moat
If they were large enough to need compute 30-40+ years ago, they certainly have some mainframes running today. Think Walmart, United Airlines, JPMC, Geico, Coca Cola and so on.
Out of those, it looks like roughly 60 are actually using COBOL in-house, while job postings from the rest mostly ask for "experience dealing with legacy COBOL systems". The top ~40 users are the ones you would expect (big banks, insurers, telcos).
Of course this is a very LinkedIn/job postings lens on it, but in terms of gauging how big the addressable market for such a solution may be, I think it should do a decent job.
[0]: https://sumble.com - Not affiliated, I just quite like their product
We also sublease our mainframes to at least 3 other ventures, one of which is very outspoken about having left the mainframe behind. I guess that's true if you view outsourcing as (literally) leaving it behind with the competitor of your new system... It seems to be the same for most banks, none of which publicly have mainframes anymore, but for weird reasons they still hire people for them offshore.
Given that our (and IBM's!) services are not cheap, I think either a) our customers are horribly dysfunctional at anything but earning money slow and steady (...) or b) they actually might depend on those mainframe jobs. So if you are IBM, or a startup adding AI to IBM, I guess the numbers might add up to the claims.
There may be other general-purpose tools out there that overlap in some ways, but our focus is on vertically specializing in the mainframe ecosystem and building AI-native tooling specifically for the problems in this space.
Here's a talk about it:
https://www.youtube.com/watch?v=W8TSPED0alY
If you load the code referenced here, https://book.gtoolkit.com/analyzing-cobol--the-aws-carddemo-... , you can explore the demo used in the talk.
I'm sure you'll manage to figure out the LLM-integrations.
Edit: The Feenk folks also have a structured theory for why and how to do these things that they've spent a lot of time and experience on refining, visualising and developing tooling around.
I think it is a good idea for anyone working with large legacy systems to have such a theoretical foundation for how to communicate, approach problems and evaluate progress. Without it one is highly likely to make expensive decisions based on gut feeling and vague assumptions.
The only other player I've seen is Mechanical Orchard
Another proposed replacement about to fail now, after half the COBOL devs were laid off.
So if anyone needs a remote OpenVMS/HP NonStop or junior z/OS dev :D
The main reasons are the loss of institutional knowledge, the difficulty of untangling 20–30-year-old code that few understand, and, most importantly, ensuring the new system is a true 1:1 functional replica of the original via testing.
Modernization is an incredibly expensive process involving numerous SMEs, moving parts, and massive budgets. Leveraging AI creates an opportunity to make this process far more efficient and successful overall.
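One common way that 1:1 functional equivalence gets demonstrated is with a golden-master harness: capture the legacy system's outputs for a large set of recorded inputs, then require the rewrite to reproduce them exactly. A minimal sketch, assuming simple batch programs driven by input files (the directory layout and run_program hook are hypothetical):

```python
# Sketch of a golden-master harness: capture the legacy program's outputs for a
# fixed set of recorded inputs, then require the rewritten program to reproduce
# them byte for byte. File layout and the run_program hook are hypothetical.
import subprocess
from pathlib import Path

CASES_DIR = Path("captured_cases")    # one recorded input file per case (hypothetical)
GOLDEN_DIR = Path("golden_outputs")   # legacy outputs captured once (hypothetical)

def run_program(binary: str, input_file: Path) -> bytes:
    """Run a batch program on one input file and return its stdout."""
    return subprocess.run(
        [binary, str(input_file)], capture_output=True, check=True
    ).stdout

def capture_golden(legacy_binary: str) -> None:
    """Record the legacy system's output for every captured case."""
    GOLDEN_DIR.mkdir(exist_ok=True)
    for case in sorted(CASES_DIR.iterdir()):
        (GOLDEN_DIR / case.name).write_bytes(run_program(legacy_binary, case))

def check_candidate(candidate_binary: str) -> float:
    """Return the fraction of recorded cases the rewrite reproduces exactly."""
    cases = sorted(CASES_DIR.iterdir())
    passed = sum(
        run_program(candidate_binary, case) == (GOLDEN_DIR / case.name).read_bytes()
        for case in cases
    )
    return passed / len(cases)
```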
COBOL projects have millions of lines of code. Any prompt/reasoning will rapidly fill the context window of any model.
And you'll probably have better luck if your tokenization understands COBOL keywords.
You probably have better luck implementing a data miner that slowly digests all the code and requirements into a proprietary information retrieval solution or ontology that can help answer questions...
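As a rough illustration of that kind of pre-digestion, here is a toy sketch that splits COBOL sources into paragraph-sized chunks and builds an identifier index, so a question only ever pulls a handful of relevant chunks into a model's context; the regex is a simplistic placeholder, not a real COBOL parser:

```python
# Toy sketch of pre-digesting a huge COBOL codebase for retrieval: split each
# source file into paragraph-sized chunks keyed by paragraph name, then map
# every identifier to the chunks that mention it, so a question only pulls the
# few chunks that matter instead of millions of lines. The regex and index
# structure are simplistic placeholders, not a real COBOL parser.
import re
from collections import defaultdict
from pathlib import Path

# Crude approximation of a paragraph label: a name starting in area A, then a period.
PARAGRAPH = re.compile(r"^ {7}([A-Z0-9][A-Z0-9-]*)\s*\.\s*$", re.IGNORECASE)

def chunk_cobol(path: Path):
    """Yield (paragraph_name, source_text) chunks for one COBOL source file."""
    name, lines = "HEADER", []
    for line in path.read_text(errors="replace").splitlines():
        m = PARAGRAPH.match(line)
        if m:
            if lines:
                yield name, "\n".join(lines)
            name, lines = m.group(1).upper(), [line]
        else:
            lines.append(line)
    if lines:
        yield name, "\n".join(lines)

def build_index(src_root: Path):
    """Map every identifier-like token to the (file, paragraph) chunks using it."""
    index = defaultdict(set)
    for path in src_root.rglob("*.cbl"):
        for para, text in chunk_cobol(path):
            for token in set(re.findall(r"[A-Z0-9-]{3,}", text.upper())):
                index[token].add((path.name, para))
    return index

# index = build_index(Path("cobol_src"))
# index["CUST-BALANCE"] -> the few chunks worth showing to a model
```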
What an engineer tells you can be inaccurate, incomplete, outdated, etc.
Maybe 50-year-old COBOL programs are the original neural networks.
The real complexity lies in also understanding z/OS (the mainframe operating system), CICS, JCL, and the rest of the mainframe runtime; it’s an entirely parallel computing universe compared to the x86 space.
False, all those jobs were outsourced and offshored long ago.
This is a problem a compiler cannot fix, and is a very real problem.
> HyperDocs ingests COBOL, JCL, and PL/I codebases to generate documentation, architecture diagrams, and dependency graphs.
Lots of tools available that do this already without AI.
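For context, the non-AI version of this is mostly static scanning. Here is a toy sketch of pulling step/program/dataset relationships out of JCL with plain regexes; the patterns are simplified and ignore PROCs, symbolics, and overrides:

```python
# Toy sketch of non-AI JCL analysis: scan job decks for EXEC PGM= and DD DSN=
# statements to recover which programs a job runs and which datasets each step
# touches. Real JCL has far more syntax; these regexes only cover simple cases.
import re
from collections import defaultdict
from pathlib import Path

STEP = re.compile(r"^//(\S+)\s+EXEC\s+PGM=([A-Z0-9$#@]+)", re.IGNORECASE)
DSN = re.compile(r"\bDSN=([A-Z0-9.$#@()+-]+)", re.IGNORECASE)

def analyze_jcl(path: Path):
    """Return {step_name: {"program": ..., "datasets": [...]}} for one job deck."""
    steps = defaultdict(lambda: {"program": None, "datasets": []})
    current = None
    for line in path.read_text(errors="replace").splitlines():
        if line.startswith("//*"):          # JCL comment line
            continue
        m = STEP.match(line)
        if m:
            current = m.group(1).upper()
            steps[current]["program"] = m.group(2).upper()
            continue
        if current:
            for dsn in DSN.findall(line):
                steps[current]["datasets"].append(dsn.upper())
    return dict(steps)

# for step, info in analyze_jcl(Path("DAILYJOB.jcl")).items():
#     print(step, info["program"], "->", ", ".join(info["datasets"]))
```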
> The goal is to build digital “twins” of the experts on how they debug, architect, and maintain these systems in practice.
That will be a neat trick; will the output be more than a sparsely populated wiki?
My experience is there’s not a lot of will or money to follow these things through.
Edit to add: there was a lot of work around business rule extraction, automatic documentation, and static analysis of mainframe systems in the 90s leading up to Y2K, but it all fizzled out after that. You all should search the literature if you haven’t.
How do you consolidate this knowledge across disparate teams and organizational silos? How will you identify and reconcile subtle differences in terminology used across the organization?
Perhaps I misunderstood, but on your website you primarily identify technical implementors as SMEs. IME modernizing legacy data systems in high-stakes environments, the devil is more on the business side: e.g. disparate teams using the same term to refer to different concepts (and having that reflected in code), or the exact stakeholders of reports or data systems being unknown and unknowable. It's also in discerning whether a rule that is opaque to you is critical to some team or workflow whose business context you're missing, is simply not used anymore, or is itself implemented wrong.
Besides, both technical and non-technical stakeholders and SMEs lean heavily on heuristics to make decisions with the data they are looking at, but often struggle to articulate them explicitly. They don't think to mention certain conditions or filters because, for them, those are baked into the terminology, or it doesn't occur to them that the organization deals with broader data than what they interact with in their day-to-day.
And unfortunately in these settings, you don't get many chances to get it wrong -- trust is absolutely critical.
I am skeptical that what you will end up with at the end of the day will be a product, at least if your intent is to provide meaningful value to people who rely on these systems and solve the problems that keep them up at night. My feeling is that you will end up as primarily a consultancy, which makes sense given that the problem you are solving isn't primarily technical in nature, it just has technical components.
Sounds great but... I migrated a big COBOL codebase several years ago. The knowledge stored in the experts is 1/ very wide, 2/ full of special cases that pop up only a few times a year, and 3/ usually about complex cases involving analysing data, files that exist only on paper, etc. I strongly doubt an AI will be able to spot that.
The knowledge that is usually missing the most is not "how is this done", because spending a few hours on COBOL code is frankly not that hard. What's missing is the "why". And that is usually stored in laws, sub-sub-laws, etc. You'll have to ingest the code and the law and pray the AI can match them.
So in the end the AI will probably do 50% of the effort, but then you'll need 150% to understand the AI's work... So I'm not sure it balances out well.
But if it works, well, that's cool, because rewriting COBOL code is not exactly fun: devs don't want to do it, customers do it because they have to (not because it'll bring additional value), and the best possible outcome is the customer saying to you, "okay, we paid you 2 million and the new system does the same things as before we started" (the most likely outcome, which I faced, is "you rewrote the system and it's worse than before"). So if an AI can do it, well, cool.
(But then it means you'll fire the team that does the migration, which, although not fun and not rocket science, requires real expertise; it's not grunt work at all.)
The goal is to replace people to 'save money'. And I'm always amused at startup founders who so obviously never worked with real people in real environments (outside of the startup bubble) that they think people are too stupid to see this for what it is. I look forward to their explanation to their investors as to why their product didn't meet expectations: after taking 5-6 seconds to figure out what this new 'tool' was intended to do, the users spent all of their time figuring out how to feed it garbage so it didn't become their 'besty twin' replacement.
We’re curious to hear your thoughts and feedback, especially from anyone who’s worked with mainframes or tried to modernize legacy systems.
Lol.
Surely not at a rate faster than one year per year?