11/18/2025, 6:54:21 PM

Show HN: Rhesis – Open-source platform for collaborative LLM application testing

1 point
0 comments

Hi HN, I'm Nicolai. I'm working with a small team in Germany on Rhesis, an open-source platform for testing conversational LLM applications and agents. We’re sharing an early community preview today.

Why we built this: We saw teams repeatedly struggle with testing: scattered test cases, unclear or inconsistent metrics, and a lot of manual effort that still missed obvious failures before production. Most tools assume a single developer runs evals alone; in practice, testing tends to involve PMs, domain experts, QA, and engineers. We built Rhesis to make that collaboration straightforward.

What it does: Rhesis is a self-hostable platform (with UI) where teams can create, run, and review tests for conversational AI systems. A few core ideas:

- Test generation: Create and run tests for single turns or full conversations; the platform can also assist with generating both single- and multi-turn scenarios using your domain context.
- Domain context / knowledge: Provide background material to guide test creation so you’re not starting from an empty prompt.
- Collaboration tools: Non-technical teammates can write test cases, leave comments, and review results; developers can dig into failures with detailed traces and outputs.
- Unified metrics: Bring in eval metrics from DeepEval, RAGAS, and similar OSS frameworks without re-implementing them.
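
As a rough illustration of the ideas in this list (not Rhesis's actual API; every name below is hypothetical), a multi-turn test case scored by metrics that share one interface might look like this in Python. The app under test is replaced by a trivial echo function so the sketch runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration only; none of these names come from Rhesis.


@dataclass
class Turn:
    role: str      # "user" or "assistant"
    content: str


@dataclass
class ConversationTest:
    name: str
    user_turns: List[str]                                   # scripted user messages, sent in order
    must_mention: List[str] = field(default_factory=list)   # phrases a reviewer expects to see


# A metric is any callable that maps a finished transcript to a score in [0, 1].
Metric = Callable[[List[Turn], ConversationTest], float]


def mention_coverage(transcript: List[Turn], test: ConversationTest) -> float:
    """Fraction of expected phrases that appear in any assistant reply."""
    if not test.must_mention:
        return 1.0
    replies = " ".join(t.content.lower() for t in transcript if t.role == "assistant")
    hits = sum(1 for phrase in test.must_mention if phrase.lower() in replies)
    return hits / len(test.must_mention)


def run_test(
    test: ConversationTest,
    app: Callable[[List[Turn]], str],
    metrics: Dict[str, Metric],
) -> Dict[str, float]:
    """Drive the app turn by turn, then score the full transcript with every metric."""
    transcript: List[Turn] = []
    for message in test.user_turns:
        transcript.append(Turn("user", message))
        reply = app(transcript)            # the app sees the whole history so far
        transcript.append(Turn("assistant", reply))
    return {name: metric(transcript, test) for name, metric in metrics.items()}


def echo_app(history: List[Turn]) -> str:
    """Stand-in for the system under test."""
    return "You said: " + history[-1].content


test = ConversationTest(
    name="refund flow",
    user_turns=["I want a refund", "Order 1234"],
    must_mention=["refund", "shipping"],
)
scores = run_test(test, echo_app, {"mention_coverage": mention_coverage})
print(scores)  # {'mention_coverage': 0.5}
```

Under this assumed design, a metric from an external framework would be wrapped in a small adapter with the same signature, so the runner never needs to know where a score came from.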

Current state: Still early. We shipped v0.4.2 last week with a zero-config Docker setup. Core flows work, but there are rough edges. Everything is MIT-licensed; an enterprise edition will come later, but the OSS core will remain free. We’re currently focused on conversational applications because that’s where we saw the biggest pain in evaluation and QA workflows.

Links:

- App: app.rhesis.ai
- GitHub: github.com/rhesis-ai/rhesis
- Docs: docs.rhesis.ai

Happy to hear your thoughts and answer any questions about the platform design, the architecture, or our thinking on collaborative testing workflows.

The post introduces Rhesis, an open-source platform for collaborative testing of conversational LLM applications, addressing the challenges teams face in testing such systems.

ID: 45970388 | Type: story | Last synced: 11/18/2025, 6:56:41 PM
