Are LLMs Better Suited for PR Reviews Than Full Codebases?
I’ve been thinking about this problem and wanted to share a perspective.
When evaluating LLMs for static analysis, I see four main dimensions: accuracy, coverage, context size, and cost.
On accuracy and coverage, today’s LLMs feel nowhere close to replacing dedicated SAST tools on real-world codebases. They do better on isolated snippets or smaller repos, but once you introduce deep dependency chains, results drop off quickly.
Context size is another bottleneck. Feeding an LLM a repo with millions of lines creates huge problems for reasoning across files, and the runtime gets impractical.
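To make that concrete, here's a rough back-of-envelope sketch of how quickly a large repo outgrows a single context window. The tokens-per-line ratio and the 200k-token window are assumptions, not measurements; adjust them for whatever model and codebase you have in mind.

```python
# Rough sketch: will a whole repo fit in one context window?
# TOKENS_PER_LINE and CONTEXT_WINDOW are assumptions, not measurements.
from pathlib import Path

TOKENS_PER_LINE = 12        # assumed average for typical source code
CONTEXT_WINDOW = 200_000    # assumed context limit of the model

def estimate_repo_tokens(repo_root: str, exts=(".py", ".js", ".go", ".java")) -> int:
    """Very rough token estimate based on line counts of source files."""
    total_lines = 0
    for path in Path(repo_root).rglob("*"):
        if path.is_file() and path.suffix in exts:
            try:
                with path.open(errors="ignore") as f:
                    total_lines += sum(1 for _ in f)
            except OSError:
                continue
    return total_lines * TOKENS_PER_LINE

if __name__ == "__main__":
    tokens = estimate_repo_tokens(".")
    windows = tokens / CONTEXT_WINDOW
    print(f"~{tokens:,} tokens; roughly {windows:.1f} context windows needed")
```

A multi-million-line repo comes out at dozens of windows even under generous assumptions, which is why cross-file reasoning degrades once you have to chunk.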
That leads to cost. Running an LLM across a massive codebase can be significantly more expensive than traditional scanners, without obvious ROI.
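The cost gap is easy to estimate the same way. The per-token price and token counts below are illustrative assumptions, not published pricing for any specific model:

```python
# Hypothetical cost comparison: scanning a whole repo vs. a single PR diff.
# The price and token counts are illustrative assumptions only.
PRICE_PER_1M_INPUT_TOKENS = 3.00   # assumed USD per million input tokens

def scan_cost(input_tokens: int, passes: int = 1) -> float:
    """Cost of feeding `input_tokens` through the model `passes` times."""
    return input_tokens * passes * PRICE_PER_1M_INPUT_TOKENS / 1_000_000

full_repo_tokens = 30_000_000   # e.g. a multi-million-line codebase
pr_diff_tokens = 4_000          # e.g. a few hundred changed lines plus context

print(f"Full repo, single pass: ${scan_cost(full_repo_tokens):.2f}")
print(f"PR diff, single pass:   ${scan_cost(pr_diff_tokens):.4f}")
```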
Where they do shine is at smaller scales — reviewing PRs, surfacing potential issues in context, or even suggesting precise fixes when the input is well-scoped. That seems like the most practical application right now. Whether providers will invest in solving the big scaling problems is still an open question.
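For reference, the well-scoped case can be this simple. A minimal sketch assuming the OpenAI Python client; the model name, prompt, and base branch are illustrative, and OPENAI_API_KEY is expected in the environment:

```python
# Minimal sketch: review a single PR diff rather than the whole repo.
# Assumes the OpenAI Python client; model name and prompt are illustrative.
import subprocess
from openai import OpenAI

def review_pr_diff(base_branch: str = "main") -> str:
    # Only the diff is sent, so the input stays far below any context limit.
    diff = subprocess.run(
        ["git", "diff", base_branch, "--unified=5"],
        capture_output=True, text=True, check=True,
    ).stdout

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[
            {"role": "system",
             "content": "You are a security-focused code reviewer. "
                        "Flag potential vulnerabilities and suggest precise fixes."},
            {"role": "user", "content": f"Review this diff:\n\n{diff}"},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(review_pr_diff())
```

Because the input is a bounded diff, the model sees everything relevant to the change at once, which is exactly the regime where LLM review seems to work today.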
Curious how others here think about the trade-offs between LLM-based approaches and existing SAST tools.
Discussion on the suitability of LLMs for code vulnerability detection, with a focus on PR reviews vs full codebases.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion
- First comment: 7m after posting
- Peak period: 3 comments in 0-1h
- Avg / period: 3 comments
Key moments
- 01 Story posted: Sep 5, 2025 at 2:33 PM EDT (4 months ago)
- 02 First comment: Sep 5, 2025 at 2:39 PM EDT (7m after posting)
- 03 Peak activity: 3 comments in 0-1h (hottest window of the conversation)
- 04 Latest activity: Sep 5, 2025 at 2:57 PM EDT (4 months ago)