Not Hacker News! (Beta) — AI companion for Hacker News

Nov 22, 2025 at 6:27 AM EST

Agent design is still hard

by the_mitsuhiko · 334 points · 190 comments
Mood: informative
Sentiment: neutral
Category: tech_discussion
Key topics: AI, Agent Design, Software Development, Complexity

Discussion Activity

Light discussion · First comment: 1d after posting · Peak period: Hour 1

[Comment distribution chart: 3 data points, based on 3 loaded comments]

Key moments

  1. Story posted — Nov 22, 2025 at 6:27 AM EST (1d ago)
  2. First comment — Nov 23, 2025 at 8:04 AM EST (1d after posting)
  3. Latest activity — Nov 23, 2025 at 4:51 PM EST (9h ago)


Discussion (190 comments)
Showing 3 comments of 190
d4rkp4ttern
16h ago
1 reply
Relatedly, there are two issues I haven’t seen discussed much in agent design, but that repeatedly come up in real-world use cases:

(1) The LLM forgets to call a tool (and instead outputs plain text). Contrary to some of the comments here saying that these concerns will disappear as frontier models improve, there will always be a need for agent scaffolding that works well with weaker LLMs (cost, privacy, etc).

(2) Determining when a task is finished. In some cases, we want the LLM to decide that (e.g search with different queries until desired info found), but in others, we want to specify deterministic task completion conditions (e.g., end the task immediately after structured info extraction, or after acting on such info, or after the LLM sees the result of that action etc).

After repeatedly running into these types of issues in production agent systems, we’ve added mechanisms for both to the Langroid[1] agent framework (I’m the lead dev), which has a blackboard-like loop architecture that makes them easy to incorporate.

For issue (1), we can configure an agent with `handle_llm_no_tool` [2] set to a “nudge” that is sent back to the LLM when a non-tool response is detected (it can also be set to a lambda function that takes other actions).
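The nudge idea is framework-independent. Here is a minimal sketch of it in plain Python, assuming a hypothetical `call_llm` function that returns a dict with an optional `tool_call` key (this is an illustration of the pattern, not Langroid's actual implementation):

```python
# Sketch of the "nudge on no-tool response" pattern.
# `call_llm` is a stand-in for any chat-completion call that may or
# may not return a tool call in its reply.

NUDGE = "You must respond with a tool call, not plain text. Try again."

def run_with_nudge(call_llm, prompt, max_nudges=3):
    """Re-prompt the LLM until it emits a tool call, or give up."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_nudges + 1):
        reply = call_llm(messages)
        if reply.get("tool_call") is not None:
            return reply  # success: the model produced a tool call
        # Non-tool response detected: record it and send the nudge back
        messages.append({"role": "assistant", "content": reply.get("content", "")})
        messages.append({"role": "user", "content": NUDGE})
    raise RuntimeError("LLM never produced a tool call")
```

A callable handler (like the lambda variant mentioned above) generalizes this: instead of a fixed nudge string, the no-tool branch could inspect the reply and choose a different corrective action.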

For issue (2) Langroid has a DSL[3] for specifying task termination conditions. It lets you specify patterns that trigger task termination, e.g.

- "T" to terminate immediately after a tool-call,

- "T[X]" to terminate after calling the specific tool X,

- "T,A" to terminate after a tool call, and agent handling (i.e. tool exec)

- "T,A,L" to terminate after tool call, agent handling, and LLM response to that
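The spirit of such a DSL can be captured in a few lines: check whether the tail of an event trace matches a comma-separated pattern of steps. This is a toy matcher mirroring the example patterns above, not Langroid's real grammar; the event tuples `("tool", name)`, `("agent",)`, and `("llm",)` are invented for illustration:

```python
# Toy matcher for termination patterns like "T", "T[X]", "T,A", "T,A,L".
# An event is ("tool", name), ("agent",), or ("llm",).

def matches(pattern, events):
    """True if the tail of `events` satisfies `pattern`."""
    steps = pattern.split(",")
    if len(events) < len(steps):
        return False
    tail = events[-len(steps):]
    for step, ev in zip(steps, tail):
        if step.startswith("T"):
            if ev[0] != "tool":
                return False
            if "[" in step:  # T[X]: require the specific tool X
                want = step[step.index("[") + 1 : -1]
                if ev[1] != want:
                    return False
        elif step == "A" and ev[0] != "agent":
            return False
        elif step == "L" and ev[0] != "llm":
            return False
    return True
```

An orchestration loop would call `matches` after appending each event and terminate the task on the first hit, which gives deterministic completion without relying on the LLM's judgment.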

[1] Langroid https://github.com/langroid/langroid

[2] Handling non-tool LLM responses https://langroid.github.io/langroid/notes/handle-llm-no-tool...

[3] Task Termination in Langroid https://langroid.github.io/langroid/notes/task-termination/

d4rkp4ttern
9h ago
EDIT: forgot to mention the other part of issue (2). In cases where we want the LLM to decide task completion, Langroid provides a DoneTool the LLM can use to signal completion. In general we find it useful to have orchestration tools so that control-flow and message-flow decisions by the LLM are unambiguous:

https://langroid.github.io/langroid/reference/agent/tools/or...
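The underlying idea is that completion becomes an explicit tool call rather than something parsed out of free text. A minimal sketch, again with a hypothetical `call_llm` stand-in and an invented `done` tool name (Langroid's DoneTool is the real version of this):

```python
# Sketch of the "done tool" idea: the agent loop terminates only when
# the model explicitly calls the (hypothetical) `done` tool, so control
# flow never depends on interpreting plain-text replies.

def agent_loop(call_llm, prompt, max_turns=10):
    """Run until the model calls the `done` tool, returning its result."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        reply = call_llm(messages)
        tool = reply.get("tool_call")
        if tool and tool["name"] == "done":
            return tool["args"].get("result")  # unambiguous completion signal
        messages.append({"role": "assistant", "content": reply.get("content", "")})
    raise RuntimeError("task did not complete within max_turns")
```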

wrochow
18h ago
It seems to me that people are missing the point. AI is a tool, useful for some things, not so useful for others. The old adage applies here quite well: to a man with a hammer, everything is a nail.

Having used AI quite extensively I have come to realize that to get quality outputs, it needs quality inputs... this is especially true with general models. It seems to me that many developers today just start typing without planning. That may work to some degree for humans, but AI needs more direction. In the early 2000s, Rational Unified Process (RUP) was all the rage. It gave way to Agile approaches, but now, I often wonder if we didn't throw the baby out with the bath water. I would wager that any AI model could produce high-quality code if provided with even a light version of RUP documentation.

187 more comments available on Hacker News

View full discussion on Hacker News
ID: 46013935 · Type: story · Last synced: 11/23/2025, 12:07:04 AM


© 2025 Not Hacker News! — independent Hacker News companion.

Not affiliated with Hacker News or Y Combinator. We simply enrich the public API with analytics.