I Tried Gleam for Advent of Code
Key topics
The rise of LLMs is sparking a heated debate about the future of programming languages, particularly niche ones like Gleam, as commenters ponder whether a language's value lies in its ability to be learned by AI tools. Some worry that path dependence on popular languages and tools will stifle innovation, while others see opportunities for AI to learn and adapt to new languages. As one commenter noted, "we need online learning for our own code bases and macros," while another argued that functional languages will gain an edge as AI tools improve. The discussion highlights a tension between the potential for AI to drive language adoption and the practical concerns of job market demand.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 40m after posting
- Peak period: 103 comments (0-12h)
- Avg / period: 16
Based on 160 loaded comments
Key moments
- Story posted: Dec 13, 2025 at 12:00 PM EST (20 days ago)
- First comment: Dec 13, 2025 at 12:40 PM EST (40m after posting)
- Peak activity: 103 comments in the 0-12h window, the hottest period of the conversation
- Latest activity: Dec 19, 2025 at 3:04 AM EST (15 days ago)
A language doesn't have to be unique to still have a particular taste associated with its patterns and idioms, and it would be unfortunate if LLM influence had the effect of suppressing the ability for that new style to develop.
The only problem I've ever had was that on maybe 3 total occasions it added a return statement, I assume because of the syntax similarity with Ruby.
But those are exactly the same mistakes most humans make when writing bash scripts, which makes them inherently flaky.
Ask it to write code in a language with strict types, a "logical" syntax where there are no tricky gotchas, and a compiler which enforces those rules, and while LLMs struggle to begin with, they eventually produce code which is nearly clean and bug-free. It works much better if there is an existing codebase where they can observe and learn from existing patterns.
On the other hand, asking them to write JavaScript and Python, sure, they fly, but they confidently implement code full of hidden bugs.
The whole "amount of training data" concern is completely overblown. I've seen LLMs do well even with my own made-up DSL. If the rules are logical and you explain the rules and show existing patterns, they can mostly do alright. Conversely, there is so much bad JavaScript and Python code in their training data that I struggle to get them to produce code in my style in those languages.
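For illustration, a minimal Gleam sketch of the kind of compiler enforcement being described (the type and function are invented for this example):

```gleam
pub type Status {
  Active
  Suspended
  Deleted
}

// If an LLM (or a human) forgets the `Deleted` branch, this fails to
// compile: Gleam case expressions must be exhaustive. In a dynamic
// language the same omission becomes a hidden runtime bug.
pub fn label(status: Status) -> String {
  case status {
    Active -> "active"
    Suspended -> "suspended"
    Deleted -> "deleted"
  }
}
```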
It seems semi-intuitive to me that a type-safe, functional programming language with referential transparency would be ideal if you could decompose a program into small components and code those.
Contrast with the likes of Swift: it's been around for years, but it's so bloated and obscure that coding agents (not just humans) have problems using it fully.
I have similar concerns to you - how well a language works with LLMs is indeed an issue we have to consider. But why do you assume that it's the volume of training data that drives this advantage? Another assumption, equally if not more valid IMO, is that languages which have fewer, well-defined, simpler constructs are easier for LLMs to generate.
Languages with sprawling complexity, where edge cases dominate dev time, all but require petabytes of training data to be feasible.
Languages that are simple enough, with a solid, unwavering mental model, can match LLMs' strengths and completely leapfrog the competition in accurate code gen.
By their own act of adding generics, the Go team ultimately admitted they were wrong, or at least that generics are better. If Gleam gets popular I think much the same will occur.
There's simply too much repeated code without generics. I tried writing a parser combinator in Gleam and it wasn't pretty.
Gleam doesn't support interfaces, not generics. You are completely right.
Haskell allows both sorts of generics. In Haskell parlance they call this higher-kinded polymorphism and the generic version of map they call fmap (as a method of the class Functor).
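To make the distinction concrete, a small sketch in Gleam (the functions are illustrative): Gleam's generics abstract over element types, but not over the container itself, which is exactly what Functor adds.

```gleam
import gleam/list
import gleam/option.{type Option}

// Ordinary generics: list.map works for any element types a and b.
pub fn double_all(xs: List(Int)) -> List(Int) {
  list.map(xs, fn(x) { x * 2 })
}

// But List and Option each need their own map. Writing a single map
// over any container f (Haskell's fmap :: Functor f => (a -> b) ->
// f a -> f b) requires higher-kinded polymorphism, which Gleam
// deliberately omits.
pub fn double_maybe(x: Option(Int)) -> Option(Int) {
  option.map(x, fn(n) { n * 2 })
}
```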
Which feels super strange, but doesn't seem to really be a problem, e.g. imagine a language where you'd write
Gleam wants you to write
Edit: found an article that explained this a bit better: https://mckayla.blog/posts/all-you-need-is-data-and-function...
Codegen is more and more rare these days, because languages have so many tools to help you write less code, like generics. LLMs could, theoretically, help you crank out similar repetitive implementations of things.
It might be the same with Gleam, whose first version was in 2019 and 1.0 in 2024. The language authors might think they are either unneeded and lead to anti-patterns, or be waiting to see the best way to implement them.
For those that don't know, it's also built upon OTP and the Erlang VM, which make concurrency and queues a trivial problem in my opinion.
Absolutely wonderful ecosystem.
I've been wanting to make Gleam my primary language, but I fear LLMs have frozen programming language advancement and adoption for anything past 2021.
But I am hopeful that Gleam has slid just under the closing door and LLMs will get up to speed on it fast.
The comment you are replying to is correct, and you are incorrect.
All OTP APIs are usable as normal within Gleam, the language is designed with it in mind, and there’s an additional set of Gleam specific additions to OTP (which you have linked there).
Gleam does not have access to only a subset of OTP, and it does not have its own distinct OTP-inspired alternative. It uses the OTP framework itself.
The library the parent links to says this:
> Not all Erlang/OTP functionality is included in this library. Some is not possible to represent in a type safe way, so it is not included.
Does this mean in practice that you can use all parts of OTP, but you might lose type checking for the parts the library doesn't cover?
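Roughly, yes. Anything the library doesn't wrap can still be reached through Gleam's Erlang FFI; the cost is that the type annotation on the binding is a promise rather than something the compiler can check against the Erlang side. A minimal sketch (the particular function bound here is just an example):

```gleam
// erlang:system_time/0 returns an integer, so this annotation happens
// to be true. If an annotation is wrong, you find out at runtime,
// exactly as in Erlang or Elixir.
@external(erlang, "erlang", "system_time")
pub fn system_time() -> Int
```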
What's the state of Gleam's JSON parsing / serialization capabilities right now?
I find it to be a lovely little language, but having to essentially write every type three times (once for the type definition, once for the serializer, once for the deserializer) isn't something I'm looking forward to.
A functional language that can run both on the backend (Beam) and frontend (JS) lets one do a lot of cool stuff, like optimistic updates, server reconciliation, easy rollback on failure etc, but that requires making actions (and likely also states) easily serializable and deserializable.
I'm waiting for something similar to serde in Rust, where you simply tag your type and it'll generate type-safe serialization and deserialization for you.
Gleam has some feature to generate the code for you via the LSP, but it's just not good enough IMHO.
Could you point to a solution that provides serde level of convenience?
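For concreteness, here is roughly what "writing the type three times" looks like with gleam/json and the stdlib decode module today (a sketch; exact APIs vary a little between versions):

```gleam
import gleam/dynamic/decode
import gleam/json

// 1. The type definition.
pub type User {
  User(name: String, age: Int)
}

// 2. The serialiser.
pub fn user_to_json(user: User) -> json.Json {
  json.object([
    #("name", json.string(user.name)),
    #("age", json.int(user.age)),
  ])
}

// 3. The deserialiser, built from field decoders.
pub fn user_decoder() -> decode.Decoder(User) {
  use name <- decode.field("name", decode.string)
  use age <- decode.field("age", decode.int)
  decode.success(User(name: name, age: age))
}
```

A serde-style derive would collapse steps 2 and 3 into an annotation on step 1, which is what the LSP code generation partially automates.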
We regularly collect feedback and haven't had problems reported here, so your feedback saying otherwise would be a useful data point.
To be fair, Rust's proc macros are only locally optimal:
While they're great to use, they're only okay to program.
Your proc-macro needs to live in another crate, and writing proc macros is difficult.
Compare this to dependently typed languages and Zig's comptime: it should be easier to provide derive(Serialize, Deserialize) as a compile-time feature inside the host language.
In not having Rust's derivation, Gleam leaves room for a future where this is solved even better.
But also, you shouldn't think of it as writing the same type twice! If you couple your external API and your internal data model you are greatly restricting your domain modelling capability. Even in languages where JSON serialisation works with reflection, I would recommend having a distinct definition for the internal and external structure so you can have the optimal structure for each context, dodging the "lowest common denominator" problem.
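A tiny sketch of that decoupling, with invented names: the wire shape and the internal model evolve independently, joined by an explicit conversion.

```gleam
// Internal model: whatever is optimal for the domain logic.
pub type User {
  User(id: Int, name: String, password_hash: String)
}

// External shape: only what the API should expose.
pub type UserResponse {
  UserResponse(id: Int, name: String)
}

pub fn to_response(user: User) -> UserResponse {
  UserResponse(id: user.id, name: user.name)
}
```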
Hi, what do people use to generate them? I found gserde.
I wonder why this is preferred over codegen (during build), possibly using some kind of annotations?
> Elixir also provides for much OTP functionality via direct access to the Erlang libraries.
This is the norm in Gleam too! Gleam’s primary design constraint is interop with Erlang code, so using these libraries is straightforward and commonplace.
This could just be my lack of familiarity with the ecosystem, though.
Gleam looks lovely and IMO is the most readable language that runs on the BEAM VM. Good job!
That aside, it is normal in Elixir to use Erlang OTP directly. Neither Elixir nor Gleam provides an entirely alternative API for OTP. It is a strength that BEAM languages call each other, not a weakness.
Why would that be the case? Many models have knowledge cutoffs in this calendar year. Furthermore I’ve found that LLMs are generally pretty good at picking up new (or just obscure) languages as long as you have a few examples. As wide and varied as programming languages are, syntactically and ideologically they can only be so different.
It’d be like inventing a new assembly language when everyone is writing code in higher level languages that compile to assembly.
I hope it’s not true, but I believe that’s what OP meant and I think the concern is valid!
- Simple semantics (e.g. easy to understand for developers + LLMs, code is "obviously" correct)
- Very strongly typed, so you can model even very complex domains in a way the compiler can verify
- Really good error messages, to make agent loops more productive
- [Maybe] Easily integrates with existing languages, or at least makes it easy to port from existing languages
We may get to a point where humans don't need to look at the code at all, but we aren't there yet, so making the code easy to vet is important. Plus, there are a few bajillion lines of legacy code that we need to deal with; wouldn't it be cool if you could port (or at least extend) it into some standardized, performant, LLM-friendly language for future development?
We're still in early days with LLMs! I don't think we're anywhere near the global optimum yet.
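On the strong-typing point above, one pattern that helps both human reviewers and LLMs is making invalid states unconstructible. A Gleam sketch with hypothetical names:

```gleam
// `opaque` means only this module can call the constructor, so every
// NonEmptyList elsewhere in the program went through from_list first.
pub opaque type NonEmptyList(a) {
  NonEmptyList(first: a, rest: List(a))
}

pub fn from_list(items: List(a)) -> Result(NonEmptyList(a), Nil) {
  case items {
    [] -> Error(Nil)
    [first, ..rest] -> Ok(NonEmptyList(first, rest))
  }
}
```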
Because LLMs make it that much faster to develop software, any potential advantage you may get from adopting a very niche language is overshadowed by the fact that you can't use it with an LLM. This makes it that much harder for your new language to gain traction. If your new language doesn't gain enough traction, it'll never end up in LLM datasets, so programmers are never going to pick it up.
I feel as though "facts" such as this are presented to me all the time on HN, but in my everyday job I encounter devs creating piles of slop that even the most die-hard AI enthusiasts in my office can't stand and have started to push back against.
I know, I know, "they just don't know how to use LLMs the right way!!!", but all of the better engineers I know, the ones capable of quickly assessing the output of an LLM, tend to use LLMs much more sparingly in their code. Meanwhile, the ones that never really understood software that well in the first place are the ones building agent-based Rube Goldberg machines that ultimately slow everyone down.
If we continue living in this AI hallucination for 5 more years, I think the only people capable of producing anything of use or value will be devs who continued to devote some of their free time to coding in languages like Gleam, and who continued to maintain and sharpen their ability to understand and reason about code.
* One developer tried to refactor a bunch of GraphQL with an LLM and ended up checking in a bunch of completely broken code. Thankfully there were API tests.
* One developer has an LLM making his PRs. He slurped up my unfinished branch, PRed it, and merged (!) it. One can only guess that the approver was also using an LLM. When I asked him why he did it, he was completely baffled and assured me he would never. Source control tells a different story.
* And I forgot to turn off LLM autocomplete after setting up my new machine. The LLM wouldn't stop hallucinating non-existent constructors for non-existent classes. Bog-standard IntelliSense did in seconds what I needed after turning off LLM autocomplete.
LLMs sometimes save me some time. But overall I'm sitting at a pretty big amount of time wasted by them that the savings have not yet offset.
So the LLM was not told how to run the tests? Without that it cannot know whether what it did works. LLMs are a bit like humans here: they try something and then need to check whether it does the right thing. Without a test cycle you definitely don't get a lot out of LLMs.
The bigger story here is not that they forgot to tell the LLM to run tests, it's that agentic use has been so normalized and overhyped that an entire PR was attempted without any QA. Even if you're personally against this, this is how most people talk about agents online.
You don't always have the privilege of working on a project with tests, and rarely are they so thorough that they catch everything. Blindly trusting LLM output without QA or Review shouldn't be normalized.
You should be reviewing everything that touches your codebase regardless of source.
It's not hard to find comments from people vibe coding apps without understanding the code, even apps handling sensitive data. And it's not hard to find comments saying agents can run by themselves.
I mean people are arguing AGI is already here. What do you mean who is normalizing this?
And if you want to try... well you get what you get!
But again, no one who is serious about their business and serious about building useful products is doing this.
Listen, you can engage with the comment or ignore everything but the first sentence and throw out personal insults. If you don't want to sound like a shill, don't write like one.
When you're telling people the problem is the LLM did not have tests, you're saying "Yeah I know you caught it spitting out random unrelated crap, but if you just let it verify if it was crap or not, maybe it would get it right after a dozen tries." Does that not seem like a horribly ineffectual way to output code? Maybe that's how some people write code, but I evaluate myself with tests to see if I accidentally broke something elsewhere. Not because I have no idea what I'm even writing to begin with.
You wrote
> Without that they cannot know if what they did works, and they are a bit like humans
They are exactly not like humans this way. LLMs break code by not writing valid code to begin with. Humans break code by forgetting an obscure business rule they heard about 6 months ago. People work on very successful projects without tests all the time. It's not my preference, but tests are non-exhaustive and no replacement for a human that knows what they're doing. And the tests are meaningless without that human writing them.
So your response to that comment, pushing them further down the path of agentic code doing everything for them, smacks of maximalism, yes.
Where is everyone working where they can just ship broken code all the time?
I use LLMs for hours every single day, and yes, sometimes they output trash. That's why the bottleneck is checking the solutions and iterating on them.
All the best engineers I know, the ones managing 3-4 client projects at once, are using LLMs nonstop and outputting 3-4x their normal output. That doesn’t mean LLMs are one-shotting their problems.
You are overlooking a blind spot that is increasingly becoming a weakness for devs: you assume that businesses care whether their software actually works. It sounds crazy from the dev side, but they really don't. As long as cash keeps hitting accounts, the MBAs in charge do not care how it gets there, and the program to find that out only requires one simple, unmistakable algorithm: money in minus money out.
Evidence:
Spreadsheets. These DSL-lite tools are almost universally known to be generally wrong and full of bugs. Yet the world literally runs on them.
Lowest-bidder outsourcing. It's well known that low-cost outsourcing produces non-functional or failed projects, or projects that limp along for years with nonstop bug stomping. Yet business is booming.
This only works in a very rich empire that is in the collapse/looting phase. Which we are in and will not change. See: History.
I once toured a dairy farm that had been a pioneer test site for Lasix. Like all good hippies, everyone I knew shunned additives. This farmer claimed that Lasix wasn't a cheat because it only worked on really healthy cows. Best practices, and then Lasix.
I nearly dropped out of Harvard's mathematics PhD program. Sticking around and finishing a thesis was the hardest thing I've ever done. It didn't take smarts. It took being the kind of person who doesn't die on a mountain.
There's a legendary Philadelphia cook who does pop-up meals, and keeps talking about the restaurant he plans to open. Professional chefs roll their eyes; being a good cook is a small part of the enterprise of engineering a successful restaurant.
(These are three stool legs. Neurodivergents have an advantage using AI. A stool is more stable when its legs are further apart. AI is an association engine. Humans find my sense of analogy tedious, but spreading out analogies defines more accurate planes in AI's association space. One doesn't simply "tell AI what to do".)
Learning how to use AI effectively was the hardest thing I've done recently, many brutal months of experiment, test projects with a dozen languages. One maintains several levels of planning, as if a corporate CTO. One tears apart all code in many iterations of code review. Just as a genius manager makes best use of flawed human talent, one learns to make best use of flawed AI talent.
My guess is that programmers who write bad code with AI were already writing bad code before AI.
Best practices, and then add AI.
If this does appear to become a problem, it should not be hard to apply the same RLHF infrastructure that's used to get LLMs to write syntactically correct, goal-accomplishing code in existing programming languages to new ones.
That would make sense if LLMs understood the domains and the concepts. They don't. They need a lot of training data to "map" the "knowledge transfer".
Personal anecdote: Claude stopped writing Java-like Elixir only some time around summer this year (Elixir is 13 years old), and it is still incapable of writing "modern HEEx", which changed some of the templating syntax in Phoenix almost two years ago.
I wrote my own language, and LLMs have been able to work with it at a good level for over a year. I don't do anything special to enable that; I just front-load some key examples of the syntax before giving the task. I don't need to explain concepts like iteration.
Also, LLMs can work with languages with unconventional paradigms; kdb comes up fairly often in my world (an array language that is also written right to left).
But consider: as LLMs get better and approach AGI, you won't need a corpus, only a specification.
In this way, AI may enable more languages, not fewer.
More trial and error because trials are cheap; in the end, less typing but hardly faster end results.
Gleam can call any Erlang function, and can somewhat handle the idc types [I'm sure it has another name].
Did I miss something that Gleam fails on? Because this is one of my concerns.
I have been meaning to ask about that on the Discord, but it's one of the ten thousand things on my backlog.
Maybe I could write a gen_event equivalent... I have some code which does very similar things.
Thank you for taking the time to respond.
I'm sure at some point, Gleam will figure it all out.
Elixir is slowly rolling out set-theoretic typing: https://hexdocs.pm/elixir/main/gradual-set-theoretic-types.h...
Why use something complex and half working, when you can have the real thing?
BTW in the 90s the entire Haskell crew spent over a year trying to come up with a type system for Erlang, and failed:
--- start quote ---
Phil Wadler [1] and Simon Marlow [2] worked on a type system for over a year and the results were published in [3]. The results of the project were somewhat disappointing. To start with, only a subset of the language was type-checkable, the major omission being the lack of process types and of type checking inter-process messages. Although their type system was never put into production, it did result in a notation for types which is still in use today for informally annotating types.
Several other projects to type check Erlang also failed to produce results that could be put into production. It was not until the advent of the Dialyzer [4] that realistic type analysis of Erlang programs became possible.
https://lfe.io/papers/%5B2007%5D%20Armstrong%20-%20HOPL%20II...
--- end quote ---
[1] Yes, that Philip Wadler, https://en.wikipedia.org/wiki/Philip_Wadler
[2] Yes, that Simon Marlow, https://en.wikipedia.org/wiki/Simon_Marlow
[3] A practical subtyping system for Erlang https://dl.acm.org/doi/10.1145/258948.258962
[4] https://www.erlang.org/doc/apps/dialyzer/dialyzer.html
Examples?
In fact, I'd say most of the Gleam code that has been generated has been surprisingly reliable and easy to reason about. I suspect this has to do with the static typing, incredible language tooling, and small surface area of the language.
I literally just copy the docs from https://tour.gleam.run/everything/ into a local MD file and let it run. Packages are also well documented, and Claude has had no issue looping with tests/type checking.
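For reference, the loop it iterates against is just the stock scaffolding that `gleam new` generates; a minimal sketch (the test itself is a placeholder):

```gleam
import gleeunit
import gleeunit/should

pub fn main() {
  gleeunit.main()
}

// `gleam test` runs every public function ending in `_test`, and
// `gleam check` type-checks without running, which keeps the agent
// loop cheap.
pub fn double_test() {
  2 * 2
  |> should.equal(4)
}
```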
In the past month I've built the following, all primarily with Claude writing the Gleam parts:
- A websocket-first analytics/feature flag platform (Gleam as the backend): https://github.com/devdumpling/beacon
- A realtime holiday celebration app for my team where Gleam manages presence, cursor state, emojis, and guestbook writes (still rough): https://github.com/devdumpling/snowglobe
- A private autobattler game backend built for the web
While it's obviously not as well-trodden as building in TypeScript or Go or Rust, I've been really happy with the results as someone not super familiar with the BEAM/Erlang.
And I wonder if Gleam + Lustre could become the new Elm.
I have bumped into "the Elm architecture" in other projects though and it was nice.
I can't believe this is still up, tbh. And I can't believe there are still people defending Elm's lack of development.
> It’s true that there hasn’t been a new release of the Elm compiler for some time. That’s on purpose: it’s essentially feature-complete.
Last talk I saw by Evan Czaplicki (from the 2025 Scala Days conf) he seemed to be working on some sort of database language https://www.youtube.com/watch?v=9OtN4iiFBsQ
Why? (I'm one such person defending Elm's lack of development)
Right now I'm toying with the idea of building a GNOME application in Rust, and the framework I'm using is relm4, which provides Elm-like abstractions over gtk-rs.
Previously I've built web applications with F# and Elmish, which again provides Elm-like abstractions for building F# applications.
I'm a backend dev mostly and use Elm for all my frontend needs. Yes there are some things compiler-side that could be improved, but basically it's fine.
I appreciate not having to keep up with new releases!
Just so no one misunderstands this. The creator (Evan) didn't get into, or start, any drama himself that I ever noticed. I'd argue he's a very chill and nice dude.
I've been on the edges of the community for probably a decade now (lurker), and all of the drama came from other people who simply didn't like the BDFL and slow releases strategy.
What are you lacking in ruby and rails, besides the types?
Probably the most underrated aspect of this new AI age is having a tutor with encyclopedic knowledge available at all times.
* To be clear, I'm not saying to "Vibe Code" it. Take the time to really understand the code, ask questions, and eventually suggest improvements.
No disrespect to FE devs. Pretty much all software development is one type of mess or another. But backend and terminals are the kind of mess that make sense to me.
Also, agree that LLMs are actually great for learning if you use them carefully.
This is part of the appeal of Lustre and Elm to me. Not the main thing, but being able to avoid JS land churn (and nulls) is quite nice.
Seems to be a filesystem, how would it replace a database?
I'm hoping it succeeds and gets bigger because I really like its ergonomics.