The Future of Python Web Services Looks Gil-Free
Posted 3 months ago · Active 2 months ago
blog.baro.dev · Tech story · High profile
Sentiment: excited, positive
Debate: 60/100
Key topics: Python, GIL, Free-Threading, Web Services
The article discusses the potential of Python 3.14's free-threading support to improve web service performance; the community is excitedly weighing the implications and potential challenges.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 6d after posting
Peak period: 82 comments (Day 7)
Avg per period: 19.4
Comment distribution: 97 data points (based on 97 loaded comments)
Key moments
1. Story posted: Oct 19, 2025 at 6:38 AM EDT (3 months ago)
2. First comment: Oct 25, 2025 at 8:46 AM EDT (6d after posting)
3. Peak activity: 82 comments in Day 7, the hottest window of the conversation
4. Latest activity: Oct 29, 2025 at 8:07 AM EDT (2 months ago)
ID: 45633311Type: storyLast synced: 11/20/2025, 4:47:35 PM
but before 3.12 the isolation was not great (and there are still basic process-level things that cannot be isolated per interpreter)
https://docs.python.org/3/library/concurrent.interpreters.ht...
Please do not complain about the global object. Using a pure function would obviously be a useless benchmark for locking and real world Python code bases have far more intricate access patterns.
Just because there's a lot of shit Python code out there, doesn't mean people who want to write clean, performant Python code should suffer for it.
How to write Python code with no globals
And a source other than you saying you should do it this way
Even though technically, everything in Python is an object, I feel strongly that programmers should avoid OOP in Python like the plague. Every object is a petri dish for state corruption.
There is a very solid list of reasons to use pure functions with explicit passing wherever humanly possible, and I personally believe there is no comparable list of reasons to use OOP.

* Stack-allocated primitives need no refcounting
* Immutable structures reduce synchronization
* Data locality improves when you pass arrays/structs rather than object graphs
* Pure functions can be parallelized without locks
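The style being advocated might look like this (a hypothetical example, not taken from the thread): an immutable record plus a pure function that receives everything it needs explicitly.

```python
from dataclasses import dataclass

# Immutable record: frozen dataclasses cannot be mutated after construction,
# which removes a whole class of shared-state bugs in threaded code.
@dataclass(frozen=True)
class Point:
    x: float
    y: float

# Pure function: all inputs are passed explicitly and nothing global is
# touched, so it can run on many threads concurrently without locking.
def translate(p: Point, dx: float, dy: float) -> Point:
    return Point(p.x + dx, p.y + dy)

p = Point(1.0, 2.0)
q = translate(p, dx=3.0, dy=4.0)
print(q)  # Point(x=4.0, y=6.0)
```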
But OOP does not necessitate mutable state; you can do OOP with immutable objects and pure methods (except the constructor). Objects are collections of partially applied functions (which is explicit in Python) that also conceal internal details within their own namespace. This is convenient in certain cases.
But very rarely.
If you want immutability as a lynchpin, you need to look at a different language. This one is very much not designed for it.
I haven’t seen or used a global more than once in my 20 years of writing Python.
Web services in Python that want to handle multiple concurrent requests in the same interpreter should use a web framework designed around that expectation, one that does not use a global request context object, such as FastAPI.
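The isolation such frameworks rely on can be sketched with the stdlib alone; a `contextvars.ContextVar` gives each thread (or async task) its own view of the current request instead of one shared mutable global (names here are hypothetical):

```python
import contextvars
import threading

# Hypothetical per-request state: ContextVar.set() in one thread is not
# visible to other threads, so there is no shared "current request" global.
current_request: contextvars.ContextVar[dict] = contextvars.ContextVar("current_request")

results = {}

def handle(request_id: int) -> None:
    current_request.set({"id": request_id})
    # Deeper code can read the current request without it being threaded
    # through every call, yet each handler only ever sees its own request.
    results[request_id] = current_request.get()["id"]

threads = [threading.Thread(target=handle, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # each handler saw only its own request id
```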
3.9: 2.78
3.14: 3.86
3.14t: 3.91
This is a silly benchmark though. Look at pyperformance if you want something that might represent real script/application performance. Generally 3.14t is about 0.9x the performance of the default build. That depends on a lot of things though.
This benchmark demonstrates that global variables are not needed to find severe regressions.
If you don't want to use global variables just add the result of f to x and stop using the global variable, i.e.
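A sketch of what that rewrite might look like (names hypothetical, assuming `f` stands in for the benchmarked function): the accumulator becomes a local carried through the loop, so nothing module-level is mutated.

```python
# Global-based version being criticized (for comparison):
#   x = 0
#   def bench(n):
#       global x
#       for i in range(n):
#           x += f(i)

def f(i: int) -> int:
    return i * i  # stand-in for the benchmarked work

# Local accumulator: the loop owns its own state, so the benchmark no
# longer measures contention on a module-level name.
def bench(n: int) -> int:
    x = 0
    for i in range(n):
        x += f(i)
    return x

print(bench(5))  # 0 + 1 + 4 + 9 + 16 = 30
```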
Variables always start with a lowercase letter in idiomatic Python unless they're constants or types.
Using single-letter uppercase for variables is not unusual in ML Python code, but that also happens to be one of the worst ecosystems when it comes to idiomatic Python and general code quality.
Because literally every import, class definition, or function definition that you make at top-level is a global.
Now some people do in fact do all those things inside a function, too, and then call that function as the only thing that actually happens globally. I've done such hacks myself to squeeze the last few percent of performance out of CPython on the very rare occasions where that's needed but dropping into C is not an option. That's certainly not idiomatic Python, though.
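The hack described above can be sketched like this (module and function names hypothetical): work done inside a function uses fast local-name lookups instead of the slower global lookups CPython performs at module level.

```python
# Imports and definitions remain module-level globals, but the actual work
# happens inside main(), where name lookups compile to LOAD_FAST instead of
# the slower LOAD_GLOBAL.
import math

def main() -> float:
    # Binding a global to a local once before a hot loop is the classic
    # micro-optimization this trick enables.
    sqrt = math.sqrt
    total = 0.0
    for i in range(1000):
        total += sqrt(i)
    return total

if __name__ == "__main__":
    print(main())
```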
I have not been seeing good Python code, for sure. Hopefully it's not a majority. But it's very far from non-existent.
But isn't that the point? Previously, pure functions would all contend on the GIL, so you could only run one at a time. Now they don't.
The question is whether 1+ thread per core with GIL free Python perform as well as 1+ process per core with GIL.
My understanding is that this global is just a way to demonstrate that the fine-grained locking in the GIL-free version may mean that preforking servers are still more performant.
Have people had any/good experiences running Granian in prod?
https://hugovk.github.io/free-threaded-wheels/
It's nice that someone else recognizes that event loop per thread is the way. I swear if you said this online any time in the past few years people looked at you like you insulted their mother. It's so much easier to manage even before the performance improvements.
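The event-loop-per-thread pattern can be sketched with the stdlib alone (worker and job names hypothetical): each thread creates and owns its own loop, so loops never contend with one another.

```python
import asyncio
import threading

async def job(name: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for real async I/O
    return f"done-{name}"

# One event loop per thread: the worker creates, drives, and closes its own
# loop, so a stalled loop in one thread cannot block the others.
def worker(name: str, out: dict) -> None:
    loop = asyncio.new_event_loop()
    try:
        out[name] = loop.run_until_complete(job(name))
    finally:
        loop.close()

results: dict = {}
threads = [threading.Thread(target=worker, args=(f"t{i}", results)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```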
If your thread gets stuck, the only recourse you will have is to kill the entire parent process.
Also, sharing memory between processes is very very very slow compared to sharing memory between threads.
Any time you have an arbitrary independent task that can be started and then stopped by the user, you will need processes in Python.
Java is a lot better at concurrency and has a vast library for concurrency primitives like atomic operations (CAS etc.)
Python's strengths lies in its strong ecosystem and ease of use. Many times that overshadows the benefits of Java's competent concurrency features.
Do you mean that there is no valid use-case where you wish to interrupt a blocking thread due to I/O, for example?
Or maybe you think Python can do that. Maybe I'm wrong, but as far as I can tell Python is not good at doing that.
Perhaps my choice of word "kill" confused you, and in that case I should've picked better wording reflecting what I meant. Perhaps you thought I meant people should kill threads/processes instead of fixing buggy CPU-intensive tasks? That's certainly not what I meant.
You don't seem to be very charitable in how you interpret what I'm saying. Instead of a very vague comment, it would've been better if you'd have explained your concern, and given me a chance to understand you.
This is certainly a common case, but killing a thread with a blocking native call on it is a very poor way to do so (not the least because you don't know what locks it might be holding at the moment it gets killed - imagine what happens if that's one of the locks used by low-level heap, for example). The proper way to address it is to use asynchronous I/O APIs that allow for cooperative cancellation. Unfortunately Linux doesn't exactly have a good track record in that department, which is why people do these kinds of hacks. On Windows you get stuff like https://learn.microsoft.com/en-us/windows/win32/fileio/cance....
>The proper way to address it is to use asynchronous I/O APIs that allow for cooperative cancellation.
I completely agree here. Async is the go-to for IO-bound operations and the ability to cancel (sends an exception to the task) is a very useful feature.
Killing threads, as in non-gracefully stopping them is a bad idea regardless of language, and not something I would encourage nor something I do myself.
In Python, if there is a CPU-bound, long-running routine that needs to be executed concurrently, this can be done using multiprocessing. I'd say, if external resource usage is well defined, a good way to stop such a task is to send SIGTERM, wait, then send SIGKILL if it hasn't stopped after a grace period. If needed, perform cleanup afterwards.
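A minimal sketch of that stop sequence (helper names hypothetical): `Process.terminate()` sends SIGTERM on POSIX, and `Process.kill()` escalates to SIGKILL (signal 9) after the grace period.

```python
import multiprocessing as mp
import time

# A CPU-bound task isolated in its own process, so it can be stopped
# without touching the parent interpreter.
def busy() -> None:
    while True:
        pass  # stand-in for a long-running computation

def stop_with_grace(p: mp.Process, grace: float = 1.0) -> None:
    p.terminate()          # SIGTERM on POSIX
    p.join(timeout=grace)  # grace period
    if p.is_alive():
        p.kill()           # escalate to SIGKILL (signal 9)
        p.join()

if __name__ == "__main__":
    p = mp.Process(target=busy)
    p.start()
    time.sleep(0.1)
    stop_with_grace(p)
    print("alive after stop:", p.is_alive())
```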
The problem with python and threads in my experience is that even a graceful interruption of individual threads can be tedious to get right.
Thanks for the link btw.
https://github.com/python/cpython/pull/119438/files#diff-efe...
This is how much of the standard library has been audited:
https://github.com/python/cpython/issues/116738
The json changes above are in Python 3.15, not the just released 3.14.
The consequences of the C changes not being made are crashes and corruption if unexpected mutation or object freeing happens. Web services are exposed to adversarial input, so be *very* careful.
It would be a big help if CPython released a tool that could at least scan a C code base to detect free threaded issues, and ideally verify it is correct.
Create or extend a list of answers to:
What heuristics predict that code will fail in CPython's nogil "free threaded" mode?
https://docs.python.org/3/howto/free-threading-extensions.ht...
And a dedicated web site:
https://py-free-threading.github.io/
But as an example neither include PySequence_Fast which is in the json.c changes I pointed to. The folks doing the auditing of stdlib do have an idea of what they are looking for, and so would be best suited to keep a list (and tool) up to date with what is needed.
The GIL's guarantees didn't extend to this.
Or, on further reading, maybe it applies to anything that implements `__iadd__` in C. Which does not appear to include native longs: https://github.com/python/cpython/blob/main/Objects/longobje...
The free threaded implementation adds what amounts to individual object locks at the C level (critical sections). This still means developers writing Python code can do whatever they want, and they will not experience corruption or crashes. The base objects have all been updated.
Python is popular because of many extensions written in C, including many in the standard library. Every single piece of that code must be updated to operate correctly in free threaded mode. That is a lot of work and is still in progress in the standard library. But in order to make the free threaded interpreter useful at this point, some have been marked as free thread safe, when that is not the case.
PS For extra fun, learn what the LD_PRELOAD environment variable does and how it can be used to abuse CPython (or other things that dynamically load shared objects).
The locking is all about reading and writing Python objects. It is not applicable to outside things like external libraries. Python objects are implemented in C code, but Python users do not need to know or care about that.
As a Python user you cannot corrupt or crash things by code you write no matter how hard you try with mutation and concurrency. The locking ensures that. Another way of looking at Python is that it is a friendly syntax for calling code written in C, and that is why people use it - the C code can be where all the performance is, while retaining the ergonomic access.
C code has to opt in to free threading - see my response to this comment
https://news.ycombinator.com/item?id=45706331
It is true that more fine grained locking can end up being done than is strictly necessary, but user's code is loaded at runtime, so you don't know in advance what could be omitted. And this is the beginning of the project - things will get better.
Aside: Yes you can use ctypes to crash things, other compiled languages can be used, concurrency is hard
This has been true forever. Nothing more needs to be said. Please, avoid Python.
On the other hand, I’ve never had issues with Python performance, in 20 years of using it, for all the reasons that have been beaten to death.
It’s great that some people want to do some crazy stuff to CPython, but honestly, don’t hold your breath. Please don’t use Python if Python interpreter performance is your top concern.
The bigger problem is that it teaches people dangerously misguided notions such as "I don't need to synchronize if I work with built-in Python collections". Which, of course, is only true if a single guaranteed-atomic operation on the collection actually corresponds to a single logical atomic operation in your algorithm. What often happens is people start writing code without locks and it works, so they keep doing it until at some point they do something that actually requires locking (like atomic remove from one collection & add to another) without realizing that they have crossed a line.
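The line-crossing described above can be sketched like this (a hypothetical example): each individual dict operation is atomic, but the compound "remove from one collection and add to another" is not, so it still needs a lock to be logically atomic.

```python
import threading

inbox = {"job": "payload"}
done = {}
lock = threading.Lock()

# The pop and the insert are each individually atomic, but another thread
# can observe or interfere with the state *between* them. Without the lock,
# the invariant "a key lives in exactly one dict" can break under
# concurrency even though no single operation ever corrupts a dict.
def move(key: str) -> None:
    with lock:
        if key in inbox:
            done[key] = inbox.pop(key)

move("job")
print(inbox, done)  # {} {'job': 'payload'}
```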
Interestingly, we've been there before, multiple times even. The original design of Java collections entailed implicit locking on every operation, with the same exact outcome. Then .NET copied that design in its own collections. Both frameworks dropped it pretty fast, though - Java in v1.2 and .NET in v2.0. But, of course, they could do it because the locking was already specific to collections - it wasn't a global lock used for literally every language object, as in Python.
Nit, that's true iff x is a primitive without the volatile modifier. That's not true for a volatile primitive.
It's quite possible to make a Python app that requires libraries A and B to be loadable into a free-threaded application, but which doesn't actually do any unsafe operations with them. We need to be able to let people load these libraries, but say: this thing may not be safe, add your own mutexes or whatever.
You have to explicitly compile the extension against a free threaded interpreter in order to get that ABI tag in your extension and even be able to load the extension. The extension then has to opt-in to free threading in its initialization.
If it does not opt-in then a message appears saying the GIL has been enabled, and the interpreter continues to run with the GIL.
This may seem a little strange but is helpful. It means the person running Python doesn't have to keep regular and free threaded Python around, and duplicate sets of extensions etc. They can just have the free threaded one, anything loaded that requires the GIL gives you the normal Python behaviour.
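That fallback can be observed from Python itself. On 3.13+, the private `sys._is_gil_enabled()` reports whether the GIL is currently active, and the `Py_GIL_DISABLED` config var says whether the build is free-threaded at all (a sketch; the underscore API is private and may change):

```python
import sys
import sysconfig

# Py_GIL_DISABLED is 1 when the interpreter was *built* free-threaded.
built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# sys._is_gil_enabled() (private, 3.13+) reports whether the GIL is active
# right now; it can be re-enabled at runtime when an extension that does
# not opt in to free threading gets imported.
if hasattr(sys, "_is_gil_enabled"):
    print("GIL currently enabled:", sys._is_gil_enabled())
else:
    print("Regular build; the GIL is always enabled.")
print("Free-threaded build:", built_free_threaded)
```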
What is a little more problematic is that some of the standard library is marked as supporting free threading, even though they still have the audit and update work outstanding.
Also the last time I checked, the compiler thread sanitizers can't work with free threaded Python.
if such a thing were possible, thread coordination would not have those issues in the first place
* Point out using APIs that return borrowed references
* Suggest assertions that critical sections are held when operating on objects
* Suggest alternate APIs
* Recognise code patterns that are similar to those done during the stdlib auditing work
The compiler thread sanitizers didn't work the last time I checked - so get them working.
Edit: A good example of what can be done is Coccinelle used in the Linux kernel which can detect problematic code (locking is way more complex!) as well as apply source transformations. https://www.kernel.org/doc/html/v6.17/dev-tools/coccinelle.h...
That being said, I strongly believe that because of the sharp edges of async-style code compared with proper coroutine-based user threads like goroutines and Java virtual threads, Python is still far behind optimal parallelism patterns.
And then there's patently stupid design decisions like using raw slices as collections and the maybe-change-maybe-copy semantics of append() that don't make it easier to reason about shared data when it needs to be shared.
FastAPI might improve your performance a little, but seriously: either use PyPy or rewrite in a compiled language.
I've found debugging Python quite easy in general, I hope the experience will be great in free-threaded mode as well.