Recursive Language Models
Key topics
The debate around recursive language models is heating up, with researchers exploring the idea of treating long prompts as part of the environment that a large language model (LLM) can interact with symbolically. Commenters are drawing parallels to existing techniques like Retrieval-Augmented Generation (RAG), but highlighting key differences, such as the recursive nature of this new approach and its more "agentic" behavior. As one commenter quipped, it's "LLMs all the way down," while others are calling for greater transparency around techniques like compaction, with some even speculating about potential implementations. The discussion is sparking interesting insights into the evolving landscape of LLMs and their potential applications.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion
First comment: 5h after posting
Peak period: 4 comments in 5-6h
Avg / period: 2.4 comments
Based on 24 loaded comments
Key moments
- 01 Story posted: Jan 3, 2026 at 6:29 AM EST (5d ago)
- 02 First comment: Jan 3, 2026 at 11:51 AM EST (5h after posting)
- 03 Peak activity: 4 comments in 5-6h, the hottest window of the conversation
- 04 Latest activity: Jan 3, 2026 at 10:06 PM EST (4d ago)
How is this fundamentally different from RAG? Looking at Figure 4, it seems like the key innovation here is that the LLM is responsible for implementing the retrieval mechanism as opposed to a human doing it.
1. RAG (as commonly used) is more of a workflow; this thing is more "agentic"
2. The recursive nature of it
First, the way I see workflow vs. agentic: the difference is where the "agency" is. In a workflow, the coder decides each step (e.g. question -> embed -> retrieve -> (optional) llm_call("rerank these parts with the question {q} in mind") -> select chunks -> llm_call("given question {q} and context {c}, answer the question to the best of your knowledge"))
The "agentic" stuff has the agent decide what to search for, how many calls to make and so on, and it then decides when to answer (i.e. if you've seen claude code / codex work on a codebase, you've seen them read files, ripgrep a repo, etc).
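The contrast the commenter draws can be sketched in a few lines. This is a toy illustration, not anyone's real implementation: `embed`, `llm_call`, and both answer functions are stand-ins invented for the sketch (word overlap in place of a vector model and an actual LM).

```python
def embed(text):
    # Toy "embedding": a bag of lowercased words.
    return set(text.lower().split())

def llm_call(prompt, docs):
    # Stand-in for a real LM call: pick the doc with the most word overlap.
    return max(docs, key=lambda d: len(embed(prompt) & embed(d)))

def rag_workflow(question, corpus, k=2):
    # Workflow: the coder fixed every step in advance (embed -> retrieve -> answer).
    q = embed(question)
    chunks = sorted(corpus, key=lambda d: len(q & embed(d)), reverse=True)[:k]
    return llm_call(f"Given question {question!r} and this context, answer.", chunks)

def agentic_answer(question, corpus, max_steps=3):
    # Agentic: the "model" chooses what to fetch next and when to stop.
    gathered = []
    for _ in range(max_steps):
        remaining = [d for d in corpus if d not in gathered]
        if not remaining:
            break
        gathered.append(llm_call(question, remaining))
        if embed(question) & embed(gathered[-1]):  # "decides" it has enough
            break
    return llm_call(question, gathered)
```

The point is where the control flow lives: in `rag_workflow` the sequence of calls is fixed by the coder, while in `agentic_answer` the loop's exit and the next retrieval depend on the model's own outputs.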
The second thing, recurrence, has been tried before (babyagi was one of the first that I remember, circa '23), but the models weren't up to it. So there was a lot of glue around them to make them kinda sorta work. Now they do.
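The recursive shape being discussed can be sketched roughly as follows. Everything here is illustrative: `lm` is a crude stand-in for a real model call, and the character budget stands in for a token-based context window.

```python
BUDGET = 200  # characters, standing in for a context-window limit

def lm(prompt):
    # Crude stand-in for a real model call: "summarize" by truncating.
    return prompt.replace("\n", " ")[:80]

def recursive_call(question, context):
    # If the context fits the window, answer directly; otherwise split it,
    # recurse on each half, then recurse once more over the partial answers.
    if len(context) <= BUDGET:
        return lm(question + "\n" + context)
    mid = len(context) // 2
    left = recursive_call(question, context[:mid])
    right = recursive_call(question, context[mid:])
    return recursive_call(question, left + "\n" + right)
```

The base case is an ordinary LM call; the recursive case lets the system handle contexts far larger than any single call could, which is the part earlier scaffolds had to fake with hand-written glue.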
This technique should be something you could swap in for whatever Claude Code bakes in — but I don’t think the correct hooks or functionality are exposed.
I have read the gemini source, and it’s a pretty simple prompt to summarize everything when the context window is full.
0: https://github.com/apple/ml-clara
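That kind of compaction reduces to something like the sketch below. This is a guess at the general shape, not the actual gemini-cli code; `compact` and its parameters are made up for illustration, and `lm` is a stub for a real model call.

```python
def compact(history, lm, max_turns=6, keep_recent=3):
    # Once the transcript exceeds the budget, fold everything except the
    # most recent turns into one model-written summary.
    if len(history) <= max_turns:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    summary = lm("Summarize the conversation so far:\n" + "\n".join(old))
    return ["[compacted] " + summary] + recent

# Usage with a stub in place of a real model call:
history = [f"turn {i}" for i in range(10)]
print(compact(history, lambda prompt: "user and agent discussed RLMs"))
```

The recursive-LM framing differs in that the model, not a fixed length trigger, decides when and how to decompose the context.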
Neat idea, but not a new idea.
> RLMs are not agents, nor are they just summarization. The idea of multiple LM calls in a single system is not new — in a broad sense, this is what most agentic scaffolds do. The closest idea we’ve seen in the wild is the ROMA agent that decomposes a problem and runs multiple sub-agents to solve each problem. Another common example is code assistants like Cursor and Claude Code that either summarize or prune context histories as they get longer and longer. These approaches generally view multiple LM calls as decomposition from the perspective of a task or problem. We retain the view that LM calls can be decomposed by the context, and the choice of decomposition should purely be the choice of an LM.
@summarizable(recursive=True)
def long_running_task(Subagent):
    ...
on my long horizon tasks, where the hierarchy is determined at agent execution time…
[0] https://github.com/adagradschool/scope
[1] https://www.deepclause.ai