Not Hacker News!
Home
Hiring
Products
Discussion
Q&A
Users
LLM Benchmarking | Trending Topic on Hacker News | Not Hacker News!
LLM Benchmarking
4 stories • 24h: 0% • 7d: 0 • 192 comments
Top contributors:
pseudolus
skrid
homanp
logancarmody
Stories
4 stories tagged with LLM benchmarking
Study Identifies Weaknesses in How AI Systems Are Evaluated
416
192 comments
by pseudolus
Posted
about 2 months ago
Active
about 1 month ago
AI evaluation
LLM benchmarking
machine learning
Codelens.ai – Community Benchmark Comparing 6 LLMs on Real Code Tasks
5
0 comments
by skrid
Posted
3 months ago
Active
about 1 month ago
LLM benchmarking
AI evaluation
software development
Are AI Models Getting Safer Over Time, or Is It Just Bullshit?
1
0 comments
by homanp
Posted
about 2 months ago
Active
about 1 month ago
AI safety
LLM benchmarking
AI development
We Benchmarked Frontier LLMs on Defensive Security. The Results Surprised Us
1
0 comments
by logancarmody
Posted
about 2 months ago
Active
about 1 month ago
AI in cybersecurity
LLM benchmarking
security operations