Python Has Had Async for 10 Years – Why Isn't It More Popular?
Source: tonybaloney.github.io
Topics: Python, Asyncio, Concurrency, Programming
The article discusses why Python's async feature, introduced 10 years ago, hasn't gained more popularity, sparking a heated debate among commenters about its usability, design, and relevance.
Snapshot generated from the HN discussion (story ID 45106189, last synced 11/20/2025, 8:23:06 PM), based on 160 loaded comments.
Story posted Sep 2, 2025 at 1:24 PM EDT. First comment Sep 2, 2025 at 1:32 PM EDT (7m after posting). Peak activity: 113 comments in the 0-6h window. Latest activity Sep 5, 2025 at 2:08 AM EDT.
like guaranteeing a close() inside a finally
asyncio is a terrible, terrible library
Eventually I wrote an "image sorter" that I found was hanging when the browser tried to download images in parallel. The image serving should not have been CPU bound, and I was even using sendfile(), but I think other requests would hold up the CPU and block the tiny amount of CPU time needed to set up that sendfile.
So I switched from aiohttp to the flask API and serve with either Flask or Gunicorn, I even front it with Microsoft IIS or nginx to handle the images so Python doesn't have to. It is a minor hassle because I develop on Windows so I have to run Gunicorn inside WSL2 but it works great and I don't have to think about server performance anymore.
My take on gunicorn is that it doesn't need any tuning or care to handle anything up to the large workgroup size other than maybe "buy some more RAM" -- and now if I want to do some inference in the server or use pandas to generate a report I can do it.
If I had to go bigger I probably wouldn't be using Python in the server and would have to face up to either dual language or doing the ML work in a different way. I'm a little intimidated about being on the public web in 2025 though with all the bad webcrawlers. Young 'uns just never learned everything that webcrawler authors knew in 1999. In 2010 there were just two bad Chinese webcrawlers that never sent a lick of traffic to anglophone sites, but now there are new bad webcrawlers every day it seems.
Async is for juggling lots of little initialisations, completions, and coordinating work.
Many apps are best single threaded with a thread pool to run (single threaded) long running tasks.
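A minimal sketch of that shape in Python, assuming the long-running work is an ordinary blocking function (`slow_report` is a hypothetical stand-in). The coordinating code stays on one thread and hands each blocking call to the default thread pool with `asyncio.to_thread` (Python 3.9+):

```python
import asyncio
import threading
import time


def slow_report(n):
    """Hypothetical blocking job; it runs on a worker thread."""
    time.sleep(0.05)
    return n * n, threading.get_ident()


async def main():
    loop_thread = threading.get_ident()
    # The event loop thread only coordinates; each blocking call is
    # pushed to the default thread pool so the loop is never stalled.
    results = await asyncio.gather(
        *(asyncio.to_thread(slow_report, n) for n in range(4))
    )
    values = [value for value, _ in results]
    off_loop = all(tid != loop_thread for _, tid in results)
    return values, off_loop


values, off_loop = asyncio.run(main())
print(values, off_loop)
```

The jobs themselves stay single threaded; only the hand-off goes through the pool.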
1) Use the network thread pool to also run application code. Then your entire program has to be super careful to not block or do CPU intensive work. This is efficient but leads to difficult to maintain programs.
2) The network thread pool passes work back and forth between an application executor. That way, the network thread pool is never starved by the application, since it is essentially two different work queues. This works great, but now every request performs multiple thread hops, which increases latency.
There has been a lot of interest lately to combine scheduling and work stealing algorithms to create a best of both worlds executor.
You could imagine, theoretically, an executor that auto-scales, and maintains different work queues and tries to avoid thread hops when possible. But ensures there are always threads available for the network.
Writing a FastAPI websocket that reads from a redis pubsub is a documentation-less flailfest
I personally blame low async adoption in Python on 1) general reduction in its popularity vs Typescript+node, which is driven by the desire to have a single stack on the frontend and backend, not by bad or good async implementations in Python (see also: Rails, once the poster child of the Web, now nearly forgotten) 2) lack of good async stdlib. parallelism and concurrency are distant thirds.
async is a concurrency mechanism.
That is, if you use external stuff and can delegate work to them, then async is concurrent (async io for instance)
But if you do not, then async is regular code with extra steps
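The difference can be seen in a small sketch, where `asyncio.sleep` stands in for waiting on an external system and a plain loop stands in for local computation. The task names and the log are illustrative assumptions, not from the thread:

```python
import asyncio


async def io_task(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0.01)  # stand-in for waiting on a socket or database
    log.append(f"{name} end")


async def cpu_task(name, log):
    log.append(f"{name} start")
    sum(i * i for i in range(10_000))  # no await, so control is never yielded
    log.append(f"{name} end")


async def run(task):
    log = []
    await asyncio.gather(task("a", log), task("b", log))
    return log


io_log = asyncio.run(run(io_task))
cpu_log = asyncio.run(run(cpu_task))
print(io_log)
print(cpu_log)
```

With delegated waiting, both tasks start before either finishes. Without an `await` that actually suspends, the two tasks run back to back, exactly as plain function calls would.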
My understanding is that JS can't do that (besides service workers which are non-shared memory), but it still has multiple concurrent code-blocks being executed at the same time, just in linear fashion. It will just never use multiple CPU cores at the same time (unless calling some non-JS non-shared-memory code)
Not worth the trouble. Shell pipelines are way easier to use. Or simply waiting —no pun intended— for the synchronous to finish.
Use a tuple, maybe walrus, and return the last item[-1].
How do I get variables for not redoing long-running computations that depend on one-another? So, what if the third tuple value depends on the second and the second in turn depends on the first?
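One way to read the tuple suggestion, as a sketch: tuple elements are evaluated left to right, so each walrus binding is visible to the elements after it. The functions `load`, `clean`, and `summarize` are hypothetical stand-ins for the dependent long-running steps:

```python
def load():
    return [3, 1, 2]


def clean(rows):
    return sorted(rows)


def summarize(rows):
    return sum(rows) / len(rows)


# Each step is bound once and reused by the next tuple element;
# [-1] picks out the final value. Requires Python 3.8+.
pipeline = lambda: (raw := load(), tidy := clean(raw), summarize(tidy))[-1]

result = pipeline()
print(result)
```

Whether this is more readable than a three-line `def` is a separate question.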
For any new app that is mostly IO-bound I'd still encourage the use of asyncio from the beginning.
It’s clear that Dr. Frankenstein has been at large and managed to get his hands on Python’s corpse.
The default linter in Vs Code keeps marking those functions with warnings though. Says I should mark them as async
"There should be one-- and preferably only one --obvious way to do it : Aim for a single, clear solution to a problem. "
Snide remark aside, I actually like the Zen of Python as programming language folklore but in 2025 AD it's kinda crazy to pretend that Python actually adheres to those tenets or whatever you wish to call them, and I'd go as far as to claim that it does a disservice to a language flexible enough for a lot of use cases. There's even someone on YouTube developing a VR game with Python.
The truth is that in python, async was too little, too late. By the time it was introduced, most people who actually needed to do lots of io concurrently had their own workarounds (forking, etc) and people who didn't actually need it had found out how to get by without it (multiprocessing etc).
Meanwhile, go showed us what good green threads can look like. Then java did it too. Meanwhile, js had better async support the whole time. But all it did was show us that async code just plain sucks compared to green thread code that can just block, instead of having to do the async dances.
So, why engage with it when you already had good solutions?
I take so much flak for this opinion at work, but I agree with you 100%.
Code that looks synchronous, but is really async, has funny failure modes and idiosyncrasies, and I generally see more bugs in the async parts of our code at work.
Maybe I’m just old, but I don’t think it’s worth it. Syntactic sugar over continuations/closures basically..
But this appears to be describing languages with green threads, rather than languages that make async explicit.
You may think of use of an async keyword as explicit async code but that is very much not the case.
If you want to see async code without the keyword, most of the code of Linux is asynchronous.
Let me try to clarify my point of view:
I don’t mean that async/await is more or less explicit than goroutines. I mean regular threaded code is more explicit than async/await code, and I prefer that.
I see colleagues struggle to correctly analyze resource usage, for instance. Someone tries to parallelize some code (perhaps naively) by converting it to async/await and then runs out of memory.
Again, I don’t mean to judge anyone. I just observe that the async/await-flavor has more bugs in the code bases I work on.
More explicit in what sense? I've written both regular threaded Python and async/await Python. Only the latter shows me precisely where the context switches occur.
Kernel-style async code, where everything is explicit:
* You write a poller that opens up queues and reads structs representing work
* Your functions are not tagged as "async" but they do not block
* When those functions finish, you explicitly put that struct in another queue based on the result
Async-await code, where the runtime is implicit:
* All async functions are marked and you await them if they might block
* A runtime of some sort handles queueing and runnability
Green threads, where all asynchrony is implicit:
* Functions are functions and can block
* A runtime wraps everything that can block to switch to other local work before yielding back to the kernel
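The first two styles can be contrasted in a rough Python sketch. The `Work` struct, its stage names, and the arithmetic are illustrative assumptions. The explicit version moves work items between stages by hand, while the async/await version leaves the queueing to the runtime:

```python
import asyncio
from collections import deque
from dataclasses import dataclass


@dataclass
class Work:
    value: int
    stage: str = "parse"


def run_explicit(values):
    """Explicit style: a poller reads work structs and requeues them by hand."""
    pending = deque(Work(v) for v in values)
    done = []
    while pending:
        item = pending.popleft()
        if item.stage == "parse":
            item.value += 1
            item.stage = "double"
            pending.append(item)  # explicitly queue the struct for the next stage
        elif item.stage == "double":
            item.value *= 2
            done.append(item.value)
    return done


async def handle(value):
    value += 1
    await asyncio.sleep(0)  # marked suspension point; the runtime requeues us
    return value * 2


async def run_async(values):
    """Async/await style: the event loop owns the queue and runnability."""
    return await asyncio.gather(*(handle(v) for v in values))


explicit_result = run_explicit([1, 2, 3])
async_result = asyncio.run(run_async([1, 2, 3]))
print(explicit_result, async_result)
```

Green threads would look like `handle` without the `async` and `await` keywords, with the runtime deciding where to switch.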
which are no different from app POV from kernel threads, or any threads for that matter.
the whole async stuff came up because context switch per event is way more expensive than just shoveling down a page of file descriptor state.
thus poll, kqueue, epoll, io_uring, whatever.
think of it as batch processing
Everything is in a run loop that does not exist in my codebase.
The context switching points are obvious but the execution environment is opaque.
At least that's how it looks to me.
If even this does not help, rm -rf is your friend.
Green threads are better (IMHO), because they actually do hide all the machinery. As a developer in a language with mature green threads (Erlang), I don't have to know about the machinery[1], I just write code that blocks from my perspective and BEAM makes magic happen. As I understand it, that's the model for Java's Project Loom aka Java Green Threads 2: 2 Green 2 Threads. The first release had some issues with the machinery, but I think I read the second release was much better, and I haven't seen much since... I'm not a Cafe Babe, so I don't follow Java that closely.
[1] It's always nice to know about the machinery, but I don't have to know about it, and I was able to get started pretty quick and figure out the machinery later.
I would say that green threads still have "function coloring stuff", we just decided that every function will be async-colored.
Now, what happens if you try to cross an FFI-border and try to call a function that knows nothing about your green-thread runtime is an entirely different story...
Thank you for explaining much more clearly than I could.
> none of the function coloring stuff
And it’s this part that I don’t like (and see colleagues struggling to implement correctly at work).
The comment you are responding to prefers green threads to be managed like goroutines, where the code looks synchronous, but really it's cooperative multitasking managed by the runtime, to explicit async/await.
But then you criticize "code that looks synchronous but is really async". So you prefer the explicit "async" keywords? What exactly is your preferred model here?
Goroutines feel like old-school, threaded code to me. I spawn a goroutine and interact with other “threads” through well defined IPC. I can’t tell if I’m spawning a green thread or a “real” system thread.
C#’s async/await is different IMO and I prefer the other model. I think the async-concept gets overused (at my workplace at least).
If you know Haskell, I would compare it to overuse of laziness, when strictness would likely use fewer resources and be much easier to reason about. I see many of the same problems/bugs with async/await..
An obvious advantage of doing it that way is you don’t need any runtime/OS-level support. Eg your runtime doesn’t need to even have a concept of threads. It works on bare metal embedded.
Another advantage is that it’s fully cooperative model. No magic preemption. You control the points where the switch can happen, there is no magic stuff suddenly running in background and messing up the state.
It is nothing like what you just described
The problem is that it is self-reinforcing: before you know it, every little function is suddenly async.
The irony is that it is used where you want to write in a synchronous style...
Async in C# is awesome, and there's nothing stopping you from writing sync code where appropriate or using threads if you want proper multi threading. Async is primarily used to avoid blocking for non-cpu-bound work, like waiting for API/db/filesystem etc. If you use it everywhere then it's used everywhere, if you don't then it isn't. For a lot of apps it makes sense to use it a lot, like in web apis that do lots of db calls and such. This incurs some overhead but it has the benefit of avoiding blocked threads so that no threads sit idle waiting for I/O.
You can imagine that in a web API receiving a large number of requests per second there's a lot of this waiting going on, and if threads were idle waiting for responses you wouldn't be able to handle nearly as much throughput.
Used to be well-hidden cooperative, these days it's preemptive.
To be fair that also happens with other solutions.
A lot of the async problems in other languages arise because they haven't fully bought into the concept: some 3rd party code uses it and some doesn't. JS went all-in with async.
[1]: Yes I know about service workers, but they are not threads in the sense that there is no shared memory*. It is good for some types of parallelization problems, but not others because of all the memory copying required.
[2]: Yes I know about SharedArrayBuffer and there is a bunch of proposals to add support for locks and all that fun stuff to them, which also brings all the complexity back.
https://nodejs.org/en/learn/asynchronous-work/overview-of-bl...
I agree with what was said above: BEAM is pretty great. I have been using it recently through Elixir.
DESPITE THAT: even if you're doing everything "right" (TM) -- using a single thread and doing all your networking I/O sequentially is simply slow as hell. A very very good example of this is bottle.py. Let's say you host a static web server with bottle.py. Every single web request for files leads to sequential loading, which makes page load times absolutely laughable. This isn't the case for every Python web framework, but it seems to be a common theme to me. (Cause: single thread, event loop.)
With asyncio, the most consistent behavior I've had seems to come from avoiding multiple processes with event loops running inside them, even though that approach (or at least threading) seems necessary to avoid the massive downsides of the event loop. But yeah, you have to keep everything simple. In my own library I use a single event loop and don't do anything fancy. I've learned the hard way how asyncio punishes trying to improve it. It's a damn cool piece of software, it just has some huge limitations for performance.
People act like it's dead but it still works perfectly well and, at least for me, makes async networking so much simpler.
[1] https://openjdk.org/jeps/491
The memory and execution model for higher level work needs to not have async. Go is the canonical example of it done well from the user standpoint IMO.
Actually, I was and am primarily a Dart developer, not a JS developer. But function color is a problem in any language that uses that style of asynchrony: JS, Dart, etc.
However, gevent has to do its magic by monkeypatching. Wanting to avoid that, IIRC, was a significant reason why the async/await syntax and the underlying runtime implementation was developed for Python.
Another significant reason, of course, was wanting to make async functions look more like sync functions, instead of having to be written very differently from the ground up. Unfortunately, requiring the "async" keyword for any async function seriously detracted from that goal.
To me, async functions should have worked like generator functions: when generators were introduced into Python, you didn't have to write "gen def" or something like it instead of just "def" to declare one. If the function had the "yield" keyword in it, it was a generator. Similarly, if a function has the "await" keyword in it, it should just automatically be an async function, without having to use "async def" to declare it.
Similarly, a function that calls an async function wouldn't itself be async unless it also had the await keyword. But of course the usual way of calling an async function would be to await it. And calling it without awaiting it wouldn't return a value, just as with a generator; calling a generator function without yielding from it returns a generator object, and calling an async function without awaiting it would return a future object. You could then await the future later, or pass it to some other function that awaited it.
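For reference, this is how the two behave in current Python, as a short sketch with hypothetical function names. A generator function is recognized purely by containing `yield`, while a coroutine function needs `async def`. Calling either one runs none of the body and just returns an object:

```python
import asyncio
import inspect


def countdown(n):
    # The presence of `yield` alone makes this a generator function.
    while n > 0:
        yield n
        n -= 1


async def double(x):
    # With a plain `def`, the `await` below would be a SyntaxError today.
    await asyncio.sleep(0)
    return x * 2


gen = countdown(3)   # nothing has run yet
coro = double(21)    # nothing has run yet either

is_gen = inspect.isgenerator(gen)
is_coro = inspect.iscoroutine(coro)
gen_values = list(gen)          # iterating drives the generator
coro_value = asyncio.run(coro)  # an event loop drives the coroutine
print(is_gen, is_coro, gen_values, coro_value)
```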
Doing async in python has the same fundamental design. You have an executor, a scheduler, and event-driven wakers on futures or promises. But you’re doing it in a fundamentally hand-cuffed environment.
You don’t get benefits like static compilation, real work-stealing, a large library ecosystem, or crazy performance boosts. Except in certain places in the stack.
Using fastapi with async is a game-changer. Writing a cli to download a bunch of stuff in parallel is great.
But if you want to use async to parse faster or make a parallel-friendly GUI, you are more than likely wasting your time using python. The benefits will be bottlenecked by other language design features. Still the GIL mostly.
I guess there is no reason you can’t make tokio in python with multiprocessing or subinterpreters, but to my knowledge that hasn’t been done.
Learning tokio was way more fun, too.
I am happy to hear stories of using pypy or something to radically improve an architecture. I don’t have any from personal experience.
I guess twisted and stackless, a long time ago.
Is this no longer true?
Taking a general case, let's say a forum, in order to render a thread one needs to search for all posts from that thread, then get all the extra data needed for rendering and finally send the rendered output to the client.
In the "regular" way of doing this, one will compose a query, that will filter things out, join all the required data bla bla, send it to the database, wait for the answer from the database and all the data to be transferred over, loop over the results and do some rendering and send the thing over to the client.
It doesn't matter how async your app code is: in this way of doing things the bottleneck is the database, as there is a fixed limit on how many things a db server can do at once, and if doing one of these things takes a long time, you still end up waiting too much.
In order for async to work, one needs to split the work load into very small chunks that can be done in parallel and very fast, therefore, sending a big query and waiting for all the result data is out of the window.
An async approach would split the db query into a search query, that returns a list of object ids, say posts, then create N number of async tasks that given a post id will return a rendered result. These tasks will do their own query to retrieve the post data, then assemble another list of async tasks to get all the other data required and render each chunk and so on. Throw in a bunch of db replicas and you get the benefits of async.
This approach is not generally used, because, let's face it, we like making the systems we use do complicated things, eg complicated sql requests.
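A sketch of the split described above, using in-memory dictionaries as hypothetical stand-ins for the database and `asyncio.sleep` to simulate query latency. A real version would issue actual queries, ideally against replicas:

```python
import asyncio

# Hypothetical in-memory stand-ins for the database tables.
POSTS = {1: ("alice", "first"), 2: ("bob", "second"), 3: ("alice", "third")}
THREADS = {"t1": [1, 2, 3]}


async def search_post_ids(thread_id):
    await asyncio.sleep(0.01)  # simulated search query
    return THREADS[thread_id]


async def render_post(post_id):
    await asyncio.sleep(0.01)  # simulated per-post query
    author, body = POSTS[post_id]
    return f"<p>{author}: {body}</p>"


async def render_thread(thread_id):
    post_ids = await search_post_ids(thread_id)
    # One task per post; gather returns results in the order of post_ids.
    chunks = await asyncio.gather(*(render_post(pid) for pid in post_ids))
    return "\n".join(chunks)


html = asyncio.run(render_thread("t1"))
print(html)
```

The per-post waits overlap, but the thread still pays for two sequential rounds of database communication: the search, then the fan-out.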
However, async tasks on a single core means potentially a lot of switching between those tasks. So async alone does not save the day here. It will have to be combined with true parallelism, to result in the speedup we want. Otherwise a single task rendering all the parts in sequence would be faster.
Also note that it depends on where your db is. The process you describe implies at least 2 rounds of db communication: the first for the initial get-forum-thread query, then a second for all the async get-forum-replies requests. So if communication with the db takes a long time, you might lose what you gained, because you did 2 rounds of that communication.
So I guess it's not a trivial matter.
The problem is not python, it's a skill issue.
First of all forking is not a workaround, it's the way multiprocessing works at the low level in Unix systems.
Second of all, forking is multiprocessing, not multithreading.
Third of all, there's the standard threading library which just works well. There's no issue here, you don't need async.
What I did have issues with, though, was async. For example, pytest's async thingy has been buggy for years with no fix in sight, so in one project I had to switch to manually making an event loop in those tests.
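A manual event loop in a test can look roughly like this. It is a sketch with a hypothetical coroutine under test, not the commenter's code. The test function is a plain synchronous function, so pytest needs no async plugin to collect and run it:

```python
import asyncio


async def fetch_answer():
    """Hypothetical coroutine under test."""
    await asyncio.sleep(0)
    return 42


def test_fetch_answer():
    # Create and close the loop by hand instead of relying on a plugin.
    loop = asyncio.new_event_loop()
    try:
        result = loop.run_until_complete(fetch_answer())
    finally:
        loop.close()
    assert result == 42
    return result


outcome = test_fetch_answer()
print(outcome)
```

`asyncio.run(fetch_answer())` is a shorter alternative when a fresh loop per test is acceptable.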
But isn't the whole purpose of async, that it enabled concurrency, not parallelism, without the weight of a thread? I agree that in most cases it is not necessary to go there, but I can imagine systems with not so many resources, that benefit from such an approach when they do lots of io.
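As a rough illustration of that point, the following sketch runs several thousand concurrent waits on a single thread. The task count and the `wait_for_io` name are arbitrary assumptions, and `asyncio.sleep` stands in for real I/O:

```python
import asyncio
import threading


async def wait_for_io(i, seen_threads):
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(0.01)  # stand-in for a network wait
    return i


async def main(n):
    seen_threads = set()
    results = await asyncio.gather(
        *(wait_for_io(i, seen_threads) for i in range(n))
    )
    return len(results), len(seen_threads)


completed, thread_count = asyncio.run(main(5_000))
print(completed, thread_count)
```

All 5,000 waits overlap, yet every coroutine observes the same thread identifier.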
Promises/thenables gave people the time to get used to the idea of deferred evaluation via a familiar callback approach... Then when async/await came along, people didn't see it as a radically new feature but more as syntactic sugar to do what they were already doing in a more succinct way without callbacks.
People in the Node.js community were very aware of async concepts since the beginning and put a lot of effort in not blocking the event loop. So Promises and then async/await were seen as solutions to existing pain points which everyone was already familiar with. A lot of people refactored their existing code to async/await.
I'll be sold on this when a green thread native UI paradigm becomes popular but it seems like all the languages with good native UI stories have async support.
To me, Go is really well designed when it comes to multithreading because it is built upon a mutual contract where it will break easily and at compile time when you mess up the contract between the scheduling thread and the sub threads.
But, for the love of Go, I have no idea who the person was that decided that the map data type has to be not threadsafe. Once you start scaling / rewriting your code to use multiple goroutines, it's like you're being thrown in the cold water without having learnt to swim before.
Mutexes are a real pain to use in Go, and they could have been avoided if the language just decided to make read/write access threadsafe for at least maps that are known to be accessed from different threads.
I get the performance aspect of that decision, but man, this is so painful because you always have to rewrite large parts of your data structures everywhere, and abstract the former maps away into a struct type that manages the mutexes, which in return feels so dirty and unclean as a provided solution.
For production systems I just use haxmap from the start, since I know its limitations (on key hashing, due to atomics). That is far easier to handle than forgetting a mutex somewhere in the codebase while you are still before the optimization phase of development.
I'd add one other aspect that we sort of take for granted these days, but affordable multi-threaded CPUs have really taken off in the last 10 years.
Not only does the stack based on green-threads "just work" without coloring your codebase with async/no-async, it allows you to scale a single compute instance gracefully to 1 instance with N vCPUs vs N pods of 2-vCPU instances.
The main difference being that now both models are simultaneously supported instead of being an implementation detail of each JVM.
python is kind of a slow choice for that sort of thing regardless and i don't think the complexity of async is all that justified for most usecases.
i still maintain my position that a good computer system should let you write logic synchronously and the system will figure out how to do things concurrently with high performance. (although getting this right would be very hard!)
Even then, nginx might be a better solution.
In JavaScript async doesn’t have a good way to nice your tasks, which is an important feature of green threads. Sindre Sorhus has a bunch of libraries that get close, but there’s still a hole.
What coroutines can do is optimize the instruction cache. But I’m not sure goroutines entirely accomplish that. There’s nothing preventing them from doing so but implementation details.
Generations of programmers have given up on downloading data async in their Python scripts and just gone to bash and added a & at the end of a curl call inside a loop.
C# has a dictator with a budget: Microsoft integrated async into C# in a formal way, with 5.0, including standard libs, debugging, docs, samples, clear guidance going forward, etc. What holes there were were dealt with in an orderly and timely manner.
JavaScript actually had a pretty messy start with async, with divergent conventions and techniques. Ultimately this got smoothed out with language additions, but it wasn't all that wonderful in the early days. Also, JavaScript started from a simpler place (single-threaded event loop) that never had "fork" and threads and all that comes with those, so there was less legacy to accommodate and fewer problems to overcome.
Python had a vast base of existing non-async software chock full of blocking code, plus an incomplete and haphazard concurrency evolution. There are several legacy concurrency solutions in Python, most still in use today. Python async is still competing and conflicting with it all. Not unlike the Python 2->3 transition.
The traditional argument against the above assertion has been that asyncio is good for I/O work, not for CPU work, but this constraint is not realistic because CPU usage is guaranteed to creep in.
In summary, I can use threading/process/interpreter pools and concurrent futures, considering I need them anyway, without really needing to introduce yet another unnecessary concurrency paradigm (of asyncio).
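For comparison, the pool-based style from the standard library looks roughly like this. `handle` is a hypothetical unit of work, and `ProcessPoolExecutor` exposes the same `map`/`submit` interface if the work turns out to be CPU heavy:

```python
from concurrent.futures import ThreadPoolExecutor


def handle(item):
    """Hypothetical unit of work; it may mix I/O waits with some CPU use."""
    return item.upper()


items = ["a", "b", "c"]

# The with-block waits for all submitted work before exiting.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(handle, items))  # results keep the input order

print(results)
```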
The vast majority of the Python code I wrote in the last 5-6 years uses asyncio, and most of the complaints I see about it (hard to debug, getting stuck, etc.) were -- at least in my case -- because there were some other libraries doing unexpected things (like threading or hard sleep()).
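The "hard sleep()" failure mode can be reproduced in a few lines. This is a sketch with hypothetical names: a background ticker task counts how often it gets to run while another coroutine pauses, first with a blocking `time.sleep` and then with `asyncio.sleep`:

```python
import asyncio
import time


async def ticker(ticks):
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def blocking_pause():
    time.sleep(0.1)  # blocks the whole event loop thread


async def cooperative_pause():
    await asyncio.sleep(0.1)  # yields to the loop while waiting


async def measure(pause):
    ticks = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # let the ticker record its first tick
    before = len(ticks)
    await pause()
    count = len(ticks) - before  # ticks that happened during the pause
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return count


blocked_ticks = asyncio.run(measure(blocking_pause))
cooperative_ticks = asyncio.run(measure(cooperative_pause))
print(blocked_ticks, cooperative_ticks)
```

During the blocking pause the ticker never runs at all, which is how a single stray `sleep()` in a dependency can make an otherwise healthy asyncio program look stuck.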
Coming from a networking background, the way I can deal with I/O has been massively simplified, and coroutines are quite useful.
But as always in HN, I'm prepared for that to be an unpopular opinion.
asyncio is easier than threads or multiprocessing: fewer locking issues, and it is easier to run small chunks of code in parallel (easier to await something than to create a thread that runs some method).
One of the most memorable "real software engineering" bugs of my career involved async Python. I was maintaining a FastAPI server which was consistently leaking file descriptors when making any outgoing HTTP requests due to failing to close the socket. This manifested in a few ways: once the server ran out of available file descriptors, it degraded to a bizarre world where it would accept new HTTP requests but then refuse to transmit any information, which was also exciting due to increasing the difficulty of remotely debugging this. Occasionally the server would run out of memory before running out of file descriptors on the OS, which was a fun red herring that resulted in at least one premature "I fixed the problem!" RAM bump.
The exact culprit was never found - I spent a full week debugging it, and concluded that the problem had to do with someone on the library/framework/system stack of FastAPI/aiohttp/asyncio having expectations about someone else in the stack closing the socket after picking up the async context, but that never actually occurring. It was impenetrable to me due to the constant context switching between the libraries and frameworks, such that I could not keep the thread of who (above my application layer) should have been closing it.
My solution was to monkey patch the native python socket class and add a FastAPI middleware layer so that anytime an outgoing socket opened, I'd add it to a map of sockets by incoming request ID. Then when the incoming request concluded I'd lookup sockets in the map and close them manually.
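The bookkeeping half of that workaround might look something like the sketch below. This is not the commenter's code: `SocketRegistry` and the request ID are hypothetical, and both the socket monkey patch and the FastAPI middleware that would call `track` and `close_all` are left out:

```python
import socket
from collections import defaultdict


class SocketRegistry:
    """Hypothetical map of open sockets keyed by incoming request ID."""

    def __init__(self):
        self._by_request = defaultdict(list)

    def track(self, request_id, sock):
        self._by_request[request_id].append(sock)

    def close_all(self, request_id):
        """Close any sockets still open for this request; return how many."""
        closed = 0
        for sock in self._by_request.pop(request_id, []):
            if sock.fileno() != -1:  # -1 means the socket is already closed
                sock.close()
                closed += 1
        return closed


registry = SocketRegistry()
left, right = socket.socketpair()  # stand-ins for leaked outgoing sockets
registry.track("req-1", left)
registry.track("req-1", right)

closed = registry.close_all("req-1")
closed_again = registry.close_all("req-1")
print(closed, closed_again)
```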
It worked, the servers were stable, and the only follow-up request was to please delete the annoying "Socket with file descriptor <x> manually closed" message from the logs, because they were cluttering things up. And thus, another brick in the wall of my opinion that I do not prefer Python for reliable, high-performance HTTP servers.
This point doesn't get enough coverage. When I saw async coming into Python and C# (the two ecosystems I was watching most closely at the time) I found it depressing just how much work was going into it that could have been productively expended elsewhere if they'd have gone with blocking calls to green threads instead.
To add insult to injury, when implementing async it seems inevitable that what's created is a bizarro-world API that mostly-mirrors-but-often-not-quite the synchronous API. The differences usually don't matter, until they do.
So not only does the project pay the cost of maintaining two APIs, the users keep paying the cost of dealing with subtle differences between them that'll probably never go away.
> I do not prefer Python for reliable, high-performance HTTP servers
I don't use it much anymore, but Twisted Matrix was (is?) great at this. Felt like a superpower to, in the oughties, easily saturate a network interface with useful work in Python.
You must be an experienced developer to write maintainable code with Twisted; otherwise, once the codebase grows a little, it quickly becomes a bunch of spaghetti code.
greenlet which is sort of minimal stackless .. before 2008
pycoev which is on one hand greenlets without memmove()s, on the other hand sort of io-scheduled m:n threading I wrote myself in 2009.
so, at least idk, 20 years?
It was first needed. Then 10 years passed, people got around to pushing it through the process aaand by the time it was done it was already not needed. so it all stalled. Same with Rust.
Nowadays server-side async is handled very differently. And client-side is dominated by that abomination called JS.
Reminds me of how long it took some to go from Python 2 to Python 3.
Asyncio means learning different syntax that buys me nothing over the existing tools. Why would I bother?
I'm personally halfway through that journey (having spent like 4h reading docs/learning, on top of the development). I suspect it could have been designed in such a way so that it's less trivially easy to mess up.