Where do deterministic rules fail for LLM guardrails?
Large Language Models · guardrails · deterministic rules
For those running LLMs in production, I’m curious where you’ve seen deterministic rules (regex, allowlists, schema validation, etc.) start to fall apart when used as guardrails. In our experience, rule-based checks are fast, cheap, and predictable, but they struggle with context, intent, and edge cases (e.g. indirect PII leaks, policy violations expressed semantically, or “valid” JSON that’s still wrong).
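To make the failure mode concrete, here is a minimal sketch of the kind of deterministic checks in question. Everything here is illustrative (the regex patterns, the required keys, and the example payloads), not from any particular production system:

```python
import json
import re

# Minimal deterministic guardrails: a PII regex and a shape check on the JSON.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.\w+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
REQUIRED_KEYS = {"user_id", "summary"}

def passes_guardrails(raw: str) -> bool:
    # Regex check: catches only explicit, pattern-shaped PII.
    if EMAIL_RE.search(raw) or SSN_RE.search(raw):
        return False
    # Schema check: verifies keys and types, says nothing about meaning.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return REQUIRED_KEYS <= data.keys() and isinstance(data["summary"], str)

# Both payloads pass, but the second leaks location details semantically:
# nothing in it matches an explicit PII pattern, and the schema is satisfied.
ok = '{"user_id": 1, "summary": "Order shipped."}'
leaky = '{"user_id": 1, "summary": "Left with the neighbor at the blue house on Elm St."}'
assert passes_guardrails(ok)
assert passes_guardrails(leaky)  # structurally valid, still a policy violation
```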
Synthesized Answer
Deterministic rules often break down when dealing with complex, nuanced, or context-dependent scenarios. For instance, detecting indirect PII leaks or policy violations that are expressed semantically rather than through explicit keywords can be challenging. Additionally, rule-based systems may struggle with edge cases, such as 'valid' JSON that is still semantically incorrect. To address these limitations, many teams adopt a hybrid approach, using deterministic rules as a first line of defense and supplementing them with LLM-based semantic checks.
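As a rough illustration of that hybrid pattern (a sketch, not any specific library's API): the cheap rules run first, and only outputs that survive them pay for the model call. `semantic_check` below is a stand-in for an LLM judge:

```python
from typing import Callable

def hybrid_guardrail(
    raw: str,
    rule_checks: list[Callable[[str], bool]],
    semantic_check: Callable[[str], bool],
) -> bool:
    # Fast path: any failing deterministic rule rejects immediately,
    # so most traffic never incurs a model call.
    if not all(check(raw) for check in rule_checks):
        return False
    # Slow path: an LLM judge evaluates context and intent the rules miss,
    # e.g. "does this reveal personal information, directly or indirectly?"
    return semantic_check(raw)

# Stand-ins for demonstration; a real semantic_check would prompt a model
# with a rubric and parse a yes/no verdict.
no_ssn = lambda s: "123-45-6789" not in s
judge_stub = lambda s: "Elm St" not in s
print(hybrid_guardrail("Order shipped.", [no_ssn], judge_stub))  # True
```

The ordering keeps the fast, predictable checks as the first line of defense while reserving the slower, costlier semantic check for the cases the rules cannot decide.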
Key Takeaways
Deterministic rules struggle with context and intent
LLM-based semantic checks can catch nuanced issues
Hybrid approaches can balance speed and accuracy