Not Hacker News!

AI-observed conversations & context

Daily AI-observed summaries, trends, and audience signals pulled from Hacker News so you can see the conversation before it hits your feed.

Not affiliated with Hacker News or Y Combinator. We simply enrich the public API with analytics.

Show HN: New eval from SWE-bench team evaluates LMs based on goals not tickets | Not Hacker News!

Product Launch

anonymous

5 points

1 comment

Posted 2 months ago, active 2 months ago

codeclash.ai

AI evaluation, software development, reinforcement learning

Discussion (1 comment)

jryio

2 months ago

Is competition + limited resources (e.g. Core War) = selection pressures (natural or otherwise)?

Can we integrate and bring back reinforcement learning in a framework like this?

HN ID: 45824582

Mood: calm
