Crimes with Python's Pattern Matching (2022)
Source: hillelwayne.com
Topics: Python, Pattern Matching, Programming Languages
The article discusses the complexities and potential pitfalls of Python's pattern matching feature, sparking a debate among commenters about its usefulness and design.
Snapshot generated from the HN discussion
Story posted: Aug 21, 2025 at 3:47 PM EDT
First comment: Aug 21, 2025 at 4:32 PM EDT (45m after posting)
Peak activity: 45 comments in 0-6h
Average: 12.3 comments per period
Latest activity: Aug 25, 2025 at 3:37 AM EDT
Based on 111 loaded comments
ID: 44977189 | Type: story | Last synced: 11/20/2025, 6:48:47 PM
First, "case foo.bar" is a value match, but "case foo" is a name capture. Python could have defined "case .foo" to mean "look up foo as a variable the normal way" with zero ambiguity, but chose not to.
Second, there's no need to special-case some builtin types as matching whole values. You can write "case float(m): print(m)" and print the float that matched, but you can't write "case MyObject(obj): print(obj)" and print your object. Python could allow "..." or "None" or something in __match_args__ to mean "the whole object", but didn't.
> While potentially useful, it introduces strange-looking new syntax without making the pattern syntax any more expressive. Indeed, named constants can be made to work with the existing rules by converting them to Enum types, or enclosing them in their own namespace (considered by the authors to be one honking great idea)[...] If needed, the leading-dot rule (or a similar variant) could be added back later with no backward-compatibility issues.
second: you can use case MyObject() as obj: print(obj)
Yeah, and I don't buy that for a microsecond.
A leading dot is not "strange" syntax: it mirrors relative imports. There's no workaround because it lets you use variables the same way you use them in any other part of the language. Having to distort your program by adding namespaces that exist only to work around an artificial pattern matching limitation is a bug, not a feature.
Also, it takes a lot of chutzpah for this PEP author to call a leading dot strange when his match/case introduces something that looks lexically like constructor invocation but is anything but.
The "as" thing works with primitives too, so why do we need int(m)? Either get rid of the syntax or make it general. Don't hard-code support for half a dozen stdlib types for some reason and make it impossible for user code to do the equivalent.
The Python pattern matching API is full of the usual stdlib antipatterns:
* It's irregular: matching prohibits things that the shape of the feature would suggest are possible because the PEP authors couldn't personally see a specific use case for those things. (What's the deal with prohibiting multiple _ but allowing as many __ as you want?)
* It privileges stdlib, as I mentioned above. Language features should not grant the standard library powers it doesn't extend to user code.
* The syntax feels bolted on. I get trying to reduce parser complexity and tool breakage by making pattern matching look like object construction, but it isn't, and the false cognate thing confuses every single person who tries to read a Python program. They could have used := or some other new syntax, but didn't, probably because of the need to build "consensus".
* The whole damn thing should have been an expression, like the if/then/else ternary, not a statement useless outside many lexical contexts in which one might want to make a decision. Why is it a statement? Probably because the PEP author didn't _personally_ have a need to pattern match in expression context.
And look: you can justify any of these technical decisions. You can find a way to justify anything you might want to do. The end result, however, is a language facility that feels more cumbersome than it should and is applicable to fewer places than one might think.
Here's how to do it right: https://www.gnu.org/software/emacs/manual/html_node/elisp/pc...
> If needed, the leading-dot rule (or a similar variant) could be added back later with no backward-compatibility issues.
So what, after another decade of debate, consensus, and compromise, we'll end up with a .-prefix-rule but one that works only if the character after the dot is a lowercase letter that isn't a vowel.
PEP: "We decided not to do this because inspection of real-life potential use cases showed that in vast majority of cases destructuring is related to an if condition. Also many of those are grouped in a series of exclusive choices."
I find this philosophical stance off-putting. It's a good thing when users find ways to use your tools in ways you didn't imagine.
PEP: In most other languages pattern matching is represented by an expression, not statement. But making it an expression would be inconsistent with other syntactic choices in Python. All decision making logic is expressed almost exclusively in statements, so we decided to not deviate from this.
We've had conditional expressions for a long time.
What do you mean "prohibiting multiple _"? As in this pattern:
That works fine.
There is no reason to have this restriction except that some people as a matter of opinion think unreachable code is bad taste and the language grammar should make bad taste impossible to express. It's often useful to introduce such things as a temporary state during editing. For example,
Why should my temporary match-all be a SyntaxError???? Maybe it's a bug. Maybe my tools should warn me about it. But the language itself shouldn't enforce restrictions rooted in good taste instead of technical necessity.
I can, however, write this:
Adding a dummy guard is a ridiculous workaround for a problem that shouldn't exist in the first place.
After starting my new job and coming back to Python after many years I was happy to see that they had added `match` to the language. Then I was immediately disappointed as soon as I started using it as I ran into its weird limitations and quirks.
Why did they design it so poorly? The language would be better off without it in its current hamstrung form, as it only adds to the already complex syntax of the language.
> PEP: In most other languages pattern matching is represented by an expression, not statement. But making it an expression would be inconsistent with other syntactic choices in Python. All decision making logic is expressed almost exclusively in statements, so we decided to not deviate from this.
> We've had conditional expressions for a long time.
Also, maybe most other languages represent it as an expression because it's the sane thing to do? Python doing its own thing here isn't the win they think it is.
That said, I don't think OP's antics are a crime. That SyntaxError though, that might be a crime.
And a class-generating callable class would get around Python caching the results of __subclasshook__.
Presumably the reason the parent comment suggested semgrep, not just a grep, is because they're aware that naive substring matching would be wrong.
You could use the playground to check your understanding before implying someone is an idiot.
https://semgrep.dev/playground/new
My best guess is that it adds complexity and makes code harder to read in a goto-style way where you can't reason locally about local things, but it feels like the author has a much more negative view ("crimes", "god no", "dark beating heart", the elmo gif).
This also makes life directly easier for me as a programmer, because I know in what code files I have to look to understand the behavior of that object.
Even linters use it to that purpose, e.g. resolving call sites by looking at the last isinstance() statement to determine the type.
__subclasshook__ puts this at risk by letting a class lie about its instances.
As an example, consider this class:
You can now write code like this:
A linter would pass this code without warnings, because it assumes that the if block is only entered if x is in fact an instance of Everything and therefore has the foo() method.
But what really happens is that the block is entered for any kind of object, and objects that don't happen to have a foo() method will throw an exception.
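The original snippets did not survive in this snapshot. A minimal sketch of the kind of class and call site being described, with `Everything`, `foo`, and `call_foo` as assumed names, might look like this:

```python
from abc import ABC


class Everything(ABC):
    """Hypothetical ABC whose hook claims every class as a subclass."""

    @classmethod
    def __subclasshook__(cls, subclass):
        return True

    def foo(self):
        return "foo"


def call_foo(x):
    if isinstance(x, Everything):
        # A tool that trusts isinstance() treats x as an Everything here,
        # even though x can be any object at all.
        return x.foo()
    return None
```

With this definition `isinstance(42, Everything)` is true, so `call_foo(42)` enters the block and raises `AttributeError`.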
It essentially allows the user to check if a class implements an interface, without explicitly inheriting ABC or Protocol. It’s up to the user to ensure the body of the case doesn’t reference any methods or attributes not guaranteed by the subclass hook, but that’s not necessarily bad, just less safe.
All things have a place and time.
Protocols don't need to be explicit superclasses for compile time checks, or for runtime checks if they opt-in with @runtime_checkable, but Protocols are also much newer than __subclasshook__.
(I love being wrong on HN, always learn something)
> check if a class implements an interface, without explicitly inheriting ABC or Protocol
This really doesn't sound like a feature that belongs in the language. Go do something custom if you really want it.
Some of these examples are similar in effect to what you might do in other languages, where you define an 'interface' and then you check to see if this class follows that interface. For example, you could define an interface DistancePoint which has the fields x and y and a distance() method, and then say "If this object implements this interface, then go ahead and do X".
Other examples, though, are more along the lines of if you implemented an interface but instead of the interface constraints being 'this class has this method' the interface constraints are 'today is Tuesday'. That's an asinine concept, which is what makes this crimes and also hilarious.
I don't find using __subclasshook__ to implement structural subtyping that you can't express with Protocols/ABCs alone to be that much of a crime. You can do evil with it but I can perform evil with any language feature.
Conforming to an interface is a widely accepted concept across many popular languages. __subclasshook__ magic is not. So there is a big difference in violating the principle of least surprise.
That said, I'd be curious to hear a legitimate example of using it to implement "structural subtyping that you can't express with Protocols/ABCs alone".
ABCs with __subclasshook__ have been available since Python 2.6, providing a mechanism to implement runtime-testable structural subtyping. Protocols and @runtime_checkable, which provide typechecking-time structural subtyping (Protocols) that can also be available at runtime (with @runtime_checkable) were added in Python 3.8, roughly 11 years later.
There may not be much reason to use __subclasshook__ in new code, but there's a pretty good reason it exists.
That's quite a different claim, and makes a lot of sense. Thanks for the history!
I do think there is a ton of indirection going on in the code that I would not immediately think to look for. As the post stated, could be a good reason for this in some things. But it would be the opposite of aiming for boring code, at that point.
https://x.com/brandon_rhodes/status/1360226108399099909
Fun fact: you can do the same thing with the current match/case, except that you have to put your logic in the body of the case so that it's obvious what's happening.
Ruby's `case`/`in` has the same problem.
it doesn't? you simply don't understand what a match statement is.
https://doc.rust-lang.org/book/ch19-03-pattern-syntax.html
notice that x is bound to 4.
It's "a DSL contrived to look like Python, and to be used inside of Python, but with very different semantics":
https://discuss.python.org/t/gauging-sentiment-on-pattern-ma...
Notice that the Python doesn't work this way, we didn't make a new variable but instead changed the existing one.
Also, the intent in the Python was a constant, in Rust we'd give this constant an uppercase name by convention, but regardless it's a constant and so of course matching against a constant does what you expect, it can't re-bind a constant, 404 is a constant and so is `const NOT_FOUND: u16 = 404;`
if each x's scope ends at the end of each case doesn't that mean there's only one x?
> we didn't make a new variable but instead changed the existing one.
so because python doesn't have scopes except for function scopes it shouldn't ever have any new features that intersect with scope?
I disagree. Consistently going with the "bad" choice (in this case, leaking the variable to the outer scope) is better than inconsistently swinging between 2 ways of doing things. Least astonishment!
C++ has struggled with this, so that paper authors sometimes plead with the committee not to make their proposal needlessly worse in the name of "consistency" with existing bad features. This famously failed for std::span, which thus managed to be not only a real world footgun in a language which already has plenty of footguns but also a PR footgun - because for "consistency" the committee removed the safety from the safety feature and I believe in C++ 26 they will repair this so it's just pointless rather than actively worse...
What Python needs is what Elixir has. A "pin" operator that forces the variable to be used as its value for matching, rather than destructuring.
Almost every version they break existing code. This is why it's common for apps written in Python to depend on specific Python versions instead of just "anything above 3.x".
By major version I meant minor version, 3.13 -> 3.14 is a minor version in Python, but a major source of breaking changes, that is what I meant. There will be no Python 4
https://www.inspiredpython.com/article/watch-out-for-mutable...
However, using the literal syntax does seem to be more efficient. So that is an argument for having dedicated syntax for an empty set.
Factory functions like list/tuple/set are function calls and are executed and avoid this problem. Hence why professional python devs default to `None` and check for that and _then_ initialise the list internally in the function body.
Adding {/} as empty set is great, sure; but that again is just another reified instance and the opposite of set() the function.
Yes, having a solution for this makes sense, but the proposed solutions are just not good. Sometimes one has to admit that not everything can be solved gracefully and just stop hunting the whale.
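The `None`-default idiom mentioned above, next to the pitfall it avoids, can be sketched like this. `append_shared` and `append_fresh` are hypothetical names.

```python
def append_shared(item, bucket=[]):  # the default list is created once, at def time
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    if bucket is None:  # build a new list on every call instead
        bucket = []
    bucket.append(item)
    return bucket
```

Two successive calls to `append_shared` without a `bucket` argument keep growing the same list, while `append_fresh` starts from an empty list each time.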
It's reasonably type-safe, and there's no need to "close" your chain - every outputted value as you write the chain can have a primitive type.
It shines in notebooks and live coding, where you might want to type stream-of-thought in the same order of operations that you want to take place. Need to log where something might be going wrong? Tee it like you're on a command line!
Idiomatic? Absolutely not. Something to push to production? Not unless you like being stabbed with pitchforks. Actually useful for prototyping? 1000%.
In practice, rshift gives a lot more flexibility! And you’d rarely chain after a numeric value.
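The library being discussed is not shown in this snapshot, so the following is only a guess at the general technique: a hypothetical `Pipe` wrapper that overloads the reflected right-shift operator so each step yields a plain value. None of these names come from the original project.

```python
class Pipe:
    """Hypothetical wrapper: `value >> Pipe(func)` calls func(value)."""

    def __init__(self, func):
        self.func = func

    def __rrshift__(self, value):
        # Python falls back to this for `value >> self` when the left
        # operand's own __rshift__ does not handle a Pipe.
        return self.func(value)


def tee(value):
    """Print a value and pass it through unchanged, like the shell's tee."""
    print("tee:", value)
    return value


# Each step returns an ordinary list or int, so the chain never needs closing.
result = [3, 1, 2] >> Pipe(sorted) >> Pipe(tee) >> Pipe(sum)
```

After this runs, `result` is the integer `6`, and the `tee` step has printed the sorted list along the way.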
It's a sign of the design quality of a programming language when 2 arbitrary features A and B of that language can be combined and the combination will not explode in your face. In python and C++ (and plenty of other languages) you constantly have the risk that 2 features don't combine. Both python and C++ are full of examples where you will learn the hard way: "ah yes, this doesn't work." Or "wow, this is really unexpected".
It's usually a good idea for operators to have a specific meaning and not randomly change that meaning depending on the context. If you want to add new operators with new meanings, that's fine. Haskell does that. The downside is people find it really tempting and you end up with a gazillion difficult-to-google custom operators that you have to learn.
Looks a lot like function composition with the arguments flipped, which in Haskell is `>>>`. Neat!
But since you’re writing imperative code and binding the result to a variable, you could also compare to `>>=`.
(https://downloads.haskell.org/~ghc/7.6.2/docs/html/libraries...)
I think there's a big gap pedagogically here. Once a person understands functional programming, these kinds of composition shorthands make for very straightforward and intuitive code.
But, if you're just understanding basic Haskell/Clojure syntax, or stuck in the rabbit hole of "monad is a monoid" style introductions, a beginner could easily start to think: "This functional stuff is really making me need to think in reverse, to need to know my full pipeline before I type a single character, and even perhaps to need to write Lisp-like (g (f x)) style constructs that are quite the opposite of the way my data is flowing."
I'm quite partial to tutorials like Railway Oriented Programming [1] which start from a practical imperative problem, embrace the idea that data and code should feel like they flow in the same direction, and gradually guide the reader to understanding the power of the functional tools they can bring to bear!
If anything, I hope this hack sparks good conversations :)
[0] https://github.com/tc39/proposal-pipeline-operator/issues/91 - 6 years and 793 comments!
[1] https://fsharpforfunandprofit.com/rop/
I wanted to wash my eyes the first time I saw it.
I'll argue that code is in fact not easy to read if reading it doesn't tell you what type an item is and what a given line of code using it even does at runtime.
Crimes with Python's pattern matching
406 points on Aug 2, 2022. 120 comments
https://news.ycombinator.com/item?id=32314368
An example that uses a similar approach would be if you had a metaclass that overrode __instancecheck__ to return true if a string matched a regular expression. Then you could create (dynamically defined) classes that used that metaclass to use in match statements to match a string against multiple regexes. Unfortunately, I can't think of a good way to extract any capture groups.
I've been hobbying in python for a decade plus at this point and it still ties my brain in knots sometimes with how it works at runtime. I do enjoy working with it though.
E: just to clarify if you're going to use this don't bury it deep in a multi file code structure, explicitly have it in the same file as you use it, otherwise people will get confused.