Implicit ODE Solvers Are Not Universally More Robust Than Explicit ODE Solvers
Source: stochasticlifestyle.com (research story)
Key topics: Numerical Methods, ODE Solvers, Computational Physics
The article challenges the assumption that implicit ODE solvers are universally more robust than explicit ODE solvers, sparking a discussion on the nuances of numerical methods and their applications in various fields.
Snapshot generated from the HN discussion
Discussion activity: very active. First comment 1h after posting; peak of 34 comments on Day 1; average of 6.7 comments per period across 40 loaded comments.
Key moments
- Story posted: Sep 16, 2025 at 9:41 AM EDT
- First comment: Sep 16, 2025 at 10:55 AM EDT (1h after posting)
- Peak activity: 34 comments in Day 1 (the hottest window of the conversation)
- Latest activity: Sep 26, 2025 at 3:22 PM EDT
ID: 45262151 | Type: story | Last synced: 11/20/2025, 5:30:06 PM
Uh... that's not how it works. The whole point of making a model and using a simulator is you don't already know what's going to happen. If your system is going to blow up, your model needs to tell you that. If it does, that's good, because it caught your mistake. And if it doesn't, it's obviously wrong, possibly in a deadly way.
I agree with you that that is 100% the point: you don't already know what's going to happen and you're doing modelling and simulation because it's cheaper/safer to do simulation than it is to build and test the real physical system. Finding failure modes, unexpected instability and oscillations, code bugs, etc. is an absolutely fantastic output from simulations.
Where I disagree: you don't already know what's going to happen, but you DO know generally what is going to happen. If you don't have, at a minimum, an intuition for what's going to happen, you are going to have a very hard time distinguishing between "numerical instability with the simulation approach taken", "a bug in the simulation engine", "a model that isn't accurately capturing the physical phenomena", and "an unexpected instability in an otherwise reasonably-accurate model".
For the first really challenging simulation engine that I worked on early on in my career I was fortunate: the simulation itself needed to be accurate to 8-9 sig figs with nanosecond resolution, but I also had access to incredibly precise state snapshots from the real system (which was already built and on orbit) every 15 minutes. As I was developing the simulator, I was getting "reasonable" values out, but when I started comparing the results against the ground-truth snapshots I could quickly see "oh, it's out by 10 meters after 30 minutes of timesteps... there's got to be either a problem with the model or a numerical stability problem". Without that ground truth data, even just identifying that there were missing terms in the model would have been exceptionally challenging. In the end, the final term that needed to get added to the model was Solar Radiation Pressure; I wasn't taking into account the momentum transfer from the photons striking the SV and that was causing just enough error in the simulation that the results weren't quite correct.
Other simulations I've worked on were more focused on closed-loop control. There was a dynamics model and a control loop. Those can be deceptive to work on in a different way: the open-loop model can be surprisingly incorrect and a tuned closed-loop control system around the incorrect model can produce seemingly correct results. Those kinds of simulations can be quite difficult to debug as well, but if you have a decent intuition of the kinds of control forces that you ought to expect to come from the controller, you can generally figure out if it's a bad numerical simulation, bad model, or good model of a bad system... but without those kinds of gut feelings and maybe some back-of-the-envelope math it's going to be challenging and it's going to be easy to trick yourself into thinking it's a good simulation.
> the simulation itself needed to be accurate to 8-9 sig figs with nanosecond resolution
What kind of application would require such demanding tolerances? My first thought was nuclear fission. But then the fact that you had one in orbit sending a data feed every 15 mins imploded my fanciful thinking.
My second guess was that this was a part of a GPS like service.
Trying to handle instantaneous constraints and propagating modes with a single timestepper is often suboptimal.
I developed a framework that systematically separates these components: using direct elliptic solvers for constraints and explicit methods for flux evolution. The resulting algorithms are both more stable and more efficient than unified implicit approaches.
The key insight is that many systems (EM, GR, fluids, quantum mechanics) share the same pattern:
- An elliptic constraint equation (solved directly, not timestepped)
- A continuity law for charge/mass/probability flux
- Wave-like degrees of freedom (handled with explicit methods)
With this structure, you can avoid the stiffness issues entirely rather than trying to power through them with implicit methods.
Paper: https://zenodo.org/records/16968225
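A minimal sketch of that kind of split, assuming a toy 1D charge-transport model with a Poisson constraint solved directly each step and the continuity law advanced explicitly; the model and all names below are illustrative and not taken from the linked paper:

```python
# Hypothetical toy model illustrating the split: a Poisson-type constraint
# solved directly every step, and the continuity law advanced explicitly.
# Not taken from the linked paper; all names are made up for illustration.
import numpy as np

def poisson_dirichlet(rho, dx):
    """Directly solve phi'' = -rho with phi = 0 at both ends (dense solve for clarity)."""
    n = rho.size
    A = (np.diag(-2.0 * np.ones(n)) +
         np.diag(np.ones(n - 1), 1) +
         np.diag(np.ones(n - 1), -1)) / dx**2
    return np.linalg.solve(A, -rho)

def step(rho, dx, dt):
    phi = poisson_dirichlet(rho, dx)         # 1) elliptic constraint: solved, never timestepped
    v = -np.gradient(phi, dx)                # 2) velocity from the potential, v = -dphi/dx
    flux = rho * v
    return rho - dt * np.gradient(flux, dx)  # 3) explicit update of d(rho)/dt = -d(rho*v)/dx

# Usage: a smooth charge bump spreading under its own field.
n = 200
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
rho = np.exp(-200.0 * (x - 0.5) ** 2)
for _ in range(200):
    rho = step(rho, dx, dt=1e-4)
print("total charge (approximately conserved):", rho.sum() * dx)
```

The point of the structure is that the elliptic solve is an ordinary linear solve inside each step, so stiffness from the constraint never enters the timestepper at all.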
When I read statements like this, I wonder if this is related to the optimal policy conditions required for infinitely lived Bellman equations to have global and per-period policies in alignment.
Trying to timestep through a constraint that should hold instantaneously creates artificial numerical difficulties. The Bellman equation's value iteration is actually another example of this same pattern...
[1] Discounting: The discount factor β ∈ (0,1) is crucial. It ensures convergence of the value function and prevents “infinite accumulation” problems.
[2] Compactness of state/action sets: The feasible action correspondence Γ(x) is nonempty, compact-valued, and upper hemicontinuous in the state x. The state space X is compact (or at least the feasible set is bounded enough to avoid unbounded payoffs).
[3] Continuity: The return (or reward) function u(x,a) is continuous in (x,a). The transition law f(x,a) is continuous in (x,a).
[4] Bounded rewards: u(x,a) is bounded (often assumed continuous and bounded). This keeps the Bellman operator well-defined and ensures contraction mapping arguments go through.
[5] Contraction mapping property: With discounting and bounded payoffs, the Bellman operator is a contraction on the space of bounded continuous functions. This guarantees existence and uniqueness of the value function V.
[6] Measurable selection for policies: Under the above continuity and compactness assumptions, the maximum in the Bellman equation is attained, and there exists a measurable policy function g(x) that selects optimal actions.
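As a concrete illustration of points [1], [4], [5], and [6], here is a tiny value-iteration sketch on a made-up 2-state MDP; the numbers are hypothetical and chosen only to show the sup-norm contraction at rate β:

```python
# Made-up 2-state, 2-action MDP: with bounded rewards and beta < 1, the Bellman
# operator is a sup-norm contraction, so value iteration converges to a unique V*.
import numpy as np

beta = 0.9                        # discount factor in (0, 1)   [condition 1]
R = np.array([[1.0, 0.0],         # R[s, a]: bounded rewards    [condition 4]
              [0.0, 2.0]])
P = np.array([[[0.9, 0.1],        # P[s, a, s']: transition probabilities
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.1, 0.9]]])

V = np.zeros(2)
for it in range(200):
    Q = R + beta * P @ V          # (TV)(s) = max_a [ R(s,a) + beta * E[V(s')] ]
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-12:   # contraction => geometric convergence [condition 5]
        break
    V = V_new

policy = Q.argmax(axis=1)         # the maximum is attained; greedy selection    [condition 6]
print("V* =", V, "policy =", policy, "iterations =", it)
```

Because the operator is a β-contraction, the iterates converge geometrically to the unique fixed point V* regardless of the starting guess.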
Of course, make sure you've done a thorough literature search, and that your paper is written from the perspective of "what is the contribution of this paper to the literature?", since most people reading your work will not read it in isolation: it'll be the hundredth or thousandth paper they've skimmed, trying to find those dozen papers relevant to their work.
That said, posting on arXiv can't be done unless someone vouches for you, which may or may not be difficult for an independent person.
I think the best bet would be to submit your paper directly to a journal. However, the paper in the GP is unlikely to be published by any reputable journal. One piece of direct feedback: if you can't explain at the start why your paper is relevant to current researchers, then why should anyone care? A sniff test for this is whether you discuss in the introduction recent papers that have tried to solve the same or similar problems. But the GP paper's references are over two decades old.
https://en.wikipedia.org/wiki/Projection_method_(fluid_dynam...
A disadvantage is that you get a splitting error term that tends to dominate, so you gain nothing from using higher-order methods in time.
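For concreteness, the per-step elliptic solve at the heart of the projection method is a pressure Poisson solve that removes the divergence of an intermediate velocity. A minimal sketch on a doubly periodic grid using FFTs, as a generic illustration not tied to any particular code:

```python
# Generic sketch of the projection step on a doubly periodic grid: solve a
# Poisson problem for the "pressure" and subtract its gradient so that the
# corrected velocity field is divergence-free (to round-off).
import numpy as np

def project(u, v, L=2 * np.pi):
    n = u.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)       # wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                                   # avoid 0/0 on the mean mode

    u_hat, v_hat = np.fft.fft2(u), np.fft.fft2(v)
    div_hat = 1j * kx * u_hat + 1j * ky * v_hat      # divergence of the intermediate field
    phi_hat = -div_hat / k2                          # elliptic solve: laplacian(phi) = div(u)
    u_hat -= 1j * kx * phi_hat                       # subtract grad(phi)
    v_hat -= 1j * ky * phi_hat
    return np.fft.ifft2(u_hat).real, np.fft.ifft2(v_hat).real

# Usage: project a random field; its discrete divergence drops to machine precision.
rng = np.random.default_rng(0)
u, v = rng.standard_normal((64, 64)), rng.standard_normal((64, 64))
u, v = project(u, v)
```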
There are whole families of _symplectic_ integrators that conserve physical quantities and are much more suitable for this sort of problem than either option discussed. Even a low-order symplectic method will conserve momentum on an example like this.
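For example, here is a small comparison sketch (not from the article) of explicit Euler versus the first-order symplectic, i.e. semi-implicit, Euler method on a unit harmonic oscillator; the symplectic variant keeps the energy error bounded while explicit Euler's energy grows without limit:

```python
# Explicit Euler vs. first-order symplectic (semi-implicit) Euler on a unit
# harmonic oscillator: the symplectic method keeps the energy error bounded,
# while explicit Euler's energy grows without bound.
import numpy as np

def explicit_euler(q, p, dt, steps):
    for _ in range(steps):
        q, p = q + dt * p, p - dt * q
    return q, p

def symplectic_euler(q, p, dt, steps):
    for _ in range(steps):
        p = p - dt * q        # kick using the current position...
        q = q + dt * p        # ...then drift using the *updated* momentum
    return q, p

energy = lambda q, p: 0.5 * (p**2 + q**2)

dt, steps = 0.05, 20000        # integrate for ~1000 time units
for name, method in [("explicit", explicit_euler), ("symplectic", symplectic_euler)]:
    q, p = method(1.0, 0.0, dt, steps)
    print(f"{name:10s} energy drift: {energy(q, p) - 0.5:+.3e}")
```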
For example, we know for mappings that we cannot preserve angles, distances and area simultaneously.
This could provide some evidence for the universe not being truly discrete since we have multiple apparent exactly preserved kinematic quantities, but it’s hard to tell experimentally since proposed discrete space times have discretization sizes on the order of hbar, which means deviations from the continuum would be hard to detect.
I am really curious about this issue and am looking for a theorem that gives an impossibility result (or an existence result).
It might be well known, but I don't know DEs well enough to be aware of the result.
1) if you have studied these things in depth, which many/most users of solver packages have not.
That isn't to say B-S methods aren't worth anything. The interesting thing is that they are a way to generate arbitrary-order RK methods (even if no specific order is efficient, it gives a general scheme) and these RK methods have a lot of parallelism that can be exploited. We really worked through this a few years ago building parallelized implementations. The results are here: https://ieeexplore.ieee.org/abstract/document/9926357/ (free version https://arxiv.org/abs/2207.08135) and are part of the DifferentialEquations.jl library. The results are a bit mixed though, which is why they aren't used as the defaults. For non-stiff equations, even with parallelism it's hard to ever get them to match the best explicit RK methods; they just aren't efficient enough.
For stiff equations, if you are in the regime where BLAS does not multithread yet (basically <256 ODEs, so 256x256 LU factorizations, though this is BLAS/CPU-dependent), then you can use parallelization in the form of the implicit extrapolation methods in order to factorize different matrices at the same time. Because you're doing more work than something like an optimized Rosenbrock method, you don't get strictly better, but by using this to get a high-order method that exploits parallelism where other methods end up being serial, you can get a good 2x-4x if you tune the initial order and everything else well. This is really nice because in the 4-200 ODE range this ends up being the fastest method for stiff ODEs that we know of right now according to the SciMLBenchmarks https://docs.sciml.ai/SciMLBenchmarksOutput/stable/, and that range is a sweet spot where "most" user code online tends to be. However, since it does require tuning a bit to your system, setting threads properly, having your laptop plugged in, using more memory, and possibly upping the initial order manually, it's not the most user-friendly method. Therefore it doesn't make a great default, since you're putting a lot more considerations into the user's mind for a 2x-4x... but it's a great option to have for the person really trying to optimize. This also points to different directions that would be more fruitful and beat this algorithm pretty handily, though... but that research is in progress.
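As a rough illustration of where the parallelism comes from, here is a toy explicit extrapolation sketch (not the DifferentialEquations.jl implementation; the implicit variants described above parallelize the per-entry matrix factorizations analogously): each first-column entry of the extrapolation tableau is an independent modified-midpoint sweep over the same interval, so those sweeps can run concurrently before the cheap serial Richardson combination.

```python
# Toy Gragg-Bulirsch-Stoer step (not the DifferentialEquations.jl code): each
# first-column tableau entry is an independent modified-midpoint sweep over the
# same interval (the parallelizable part), followed by Richardson extrapolation.
import numpy as np

def modified_midpoint(f, t, y, H, n):
    """Gragg's modified midpoint rule with n substeps over [t, t + H]."""
    h = H / n
    z_prev, z = y, y + h * f(t, y)
    for m in range(1, n):
        z_prev, z = z, z_prev + 2 * h * f(t + m * h, z)
    return 0.5 * (z_prev + z + h * f(t + H, z))

def gbs_step(f, t, y, H, seq=(2, 4, 6, 8)):
    # The entries of the first column are independent and could run on separate threads.
    T = [[modified_midpoint(f, t, y, H, n)] for n in seq]
    for k in range(1, len(seq)):                  # Richardson extrapolation in h^2 (cheap, serial)
        for j in range(k, len(seq)):
            r = (seq[j] / seq[j - k]) ** 2
            T[j].append(T[j][k - 1] + (T[j][k - 1] - T[j - 1][k - 1]) / (r - 1))
    return T[-1][-1]

# Usage: y' = -y with one big step of size 1; compare against exp(-1).
y1 = gbs_step(lambda t, y: -y, 0.0, np.array([1.0]), 1.0)
print("error vs exp(-1):", abs(y1[0] - np.exp(-1.0)))
```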
Something I've had trouble finding any information about is, what if we have a mathematical system, where the initial configuration is known to infinite precision, and we want to solve the ODE to an arbitrary precision (say, 100 digits, or 1000 digits)? All the usual stepwise methods would seemingly require a very high order to avoid a galactic step count, but initializing all the coefficients, etc., would be a struggle in itself.
Is this the best that can be done, or are there less-usual methods that would win out at extremely high accuracy levels?
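One family of answers, hedged since it may already be what you've tried: arbitrary-order Taylor-series integrators, where the method order grows with the requested precision so the step count stays modest. mpmath exposes one; a small sketch, assuming its documented odefun interface:

```python
# Arbitrary-order Taylor-series integration at 100-digit working precision,
# assuming mpmath's documented odefun interface. Solves y' = y, y(0) = 1.
from mpmath import mp, odefun, exp

mp.dps = 100                       # 100 significant decimal digits
f = odefun(lambda x, y: y, 0, 1)   # Taylor-method stepper; degree adapts to precision
print(f(1))                        # 2.71828... to roughly full precision
print(abs(f(1) - exp(1)))          # error should be near the working precision
```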
When the number of dimensions becomes very high, or the function becomes extremely irregular, variations of Monte Carlo are generally far better than any other methods and reach much higher accuracy than the alternatives, but that accuracy is still much lower than what low-dimensional methods achieve.
Thank you, this sounds like what I'm looking for. Would you know of any further resources on this? Most of what I've been playing with has 4 dimensions or fewer.
(E.g., in one particular problem that vexed me, I had a 2D vector x(t) and an ODE x'(t) = F(t,x(t)), where F is smooth and well-behaved except for a singularity at the origin x = (0,0). It always hits the origin in finite time, so I wanted to calculate that hitting time from a given x(0). Series solutions work well in the vicinity of the origin, the problem is accurately getting to that vicinity.)
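For the hitting-time part specifically, one common pattern is to integrate with a terminal event that fires when the trajectory enters the radius where the series solution takes over. A sketch with a stand-in right-hand side (unit-speed radial infall, not your actual F):

```python
# Sketch: integrate until |x| shrinks to the radius where the local series
# solution takes over. F here is a stand-in (unit-speed radial infall), not
# the actual right-hand side from the comment above.
import numpy as np
from scipy.integrate import solve_ivp

EPS = 1e-6                                  # hand-off radius for the series solution

def F(t, x):
    return -x / np.linalg.norm(x)           # singular at the origin, like the real problem

def near_origin(t, x):
    return np.linalg.norm(x) - EPS          # root: trajectory has entered the series region
near_origin.terminal = True
near_origin.direction = -1

sol = solve_ivp(F, (0.0, 10.0), [1.0, 0.5], events=near_origin,
                rtol=1e-12, atol=1e-14)
t_enter = sol.t_events[0][0]
# For this stand-in field |x| decreases at unit rate, so the event should fire
# near |x(0)| - EPS; the last EPS of travel would come from the series expansion.
print("entered series region at t =", t_enter,
      "(expected ~", np.hypot(1.0, 0.5) - EPS, ")")
```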
(Of course arbitrary precision is far, far worse than double precision in terms of performance, which is why so much work goes into techniques that avoid increasing precision. But sometimes you have few enough instances to solve that you can simply eat the cost.)