Building a CI/CD Pipeline Runner From Scratch in Python
Key topics
The article describes building a CI/CD pipeline runner from scratch in Python. The discussion weighs whether such a project is necessary and examines its design choices, particularly for air-gapped environments and the use of Docker for job execution.
Snapshot generated from the HN discussion
Discussion Activity
Active discussion
- First comment: 3d after posting
- Peak period: 14 comments in the 72-84h window
- Avg / period: 4.6
Based on 23 loaded comments
Key moments
1. Story posted: Nov 9, 2025 at 2:26 PM EST (2 months ago)
2. First comment: Nov 12, 2025 at 12:52 PM EST (3d after posting)
3. Peak activity: 14 comments in the 72-84h window (hottest window of the conversation)
4. Latest activity: Nov 14, 2025 at 4:55 PM EST (about 2 months ago)
I have another custom flow implementation that I find more ergonomic: https://hofstadter.io/getting-started/task-engine/
Argo Workflows does not live up to what it advertises; it is much more complex to set up and then build workflows for. Helm + Argo is a pain (both use the same template delimiters...)
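For context on that delimiter clash: Helm renders charts as Go templates delimited by {{ }}, and Argo Workflows uses {{ }} for its own parameter substitution, so any Argo expression inside a Helm-templated manifest has to be escaped past Helm first. A minimal sketch (the field and the workflow.name parameter are just illustrative):

```yaml
# Inside a Helm chart, this line would be consumed by Helm, not Argo,
# and would fail to render as intended:
#   value: "{{workflow.name}}"
# To hand a literal {{workflow.name}} through to Argo, escape the braces:
value: '{{ "{{" }}workflow.name{{ "}}" }}'
```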
1. There is no central database to coordinate things. Rather, it tries to manage serialization of important bits to/from XML for a lot of things, across a lot of concurrent processes. If you ever think you can manage concurrency better than MySQL/Postgres, you should examine your assumptions.
2. In part because of the dance-of-the-XMLs, Jenkins slows to a crawl when a lot of things are running at the same time, so you are limited in the number of worker nodes. At my last company that used Jenkins, they instituted rules to keep below 100 worker nodes (and usually fewer than that) per Jenkins instance. This led to fleets of Jenkins servers (and even a Jenkins server to build Jenkins servers as a service), and lots of wasted worker-node time.
3. "Everything is a plugin" sounds great, but it winds up with lots of plugins that don't necessarily work with each other, often in subtle ways. In the community this wound up with blessed sets of plugins that most people used, and then you gambled with a few others you felt you needed. Part of this problem is the choice of XMLs-as-database, but it goes farther than that.
4. The server/client protocol works by shipping serialized Java processes to the client, which runs them and reserializes them to ship back at the end, rather than using something like RPC. This winds up being very fragile (e.g., communication breaks were a constant problem), makes troubleshooting a pain, and prevents you from doing things like restarting the node in the middle of a job (so you usually have Jenkins work on a Launchpad, and have a separate device-under-test).
Some of these could be worked on, but there seemed to be no desire in the community to make the large changes that would be required. In fact, there seemed to be pride in all of these decisions, as if they were bold ideas that somehow made things better.
If you are talking about Jenkins-X, that is a different story; it's basically a rewrite for Kubernetes. I haven't talked to anyone actually using it. If you go k8s, you are far more likely to go Argo.
IMO, CI should run the same commands humans would run (or could run, if it is a production setting). Thus our Jenkins pipelines became a bunch of DSL boilerplate wrapped around make commands. The other nice thing about this is that it makes migrating to a new CI system easier.
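A minimal sketch of that pattern, assuming a hypothetical Makefile with lint/test/build targets; the stage names are illustrative:

```python
import subprocess

# Each CI stage is a thin wrapper around the same `make` target
# a developer would run locally (stage and target names are hypothetical).
STAGES = [("Lint", "lint"), ("Test", "test"), ("Build", "build")]

for name, target in STAGES:
    print(f"=== {name} ===")
    # check=True fails the pipeline on a non-zero exit code
    subprocess.run(["make", target], check=True)
```

Because the pipeline itself does nothing beyond invoking make, moving to another CI system mostly means porting the boilerplate, not the build logic.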
It's one of the few CI tools where you can test your pipeline without committing it. You also have controls such as only pulling the pipeline from trunk, again, something that wasn't always available elsewhere.
However, it can also be a complete footgun if you're not fairly savvy. Pipeline security isn't something every developer groks.
> Build a dependency graph (which jobs need which other jobs)
> Execute jobs in topological order (respecting dependencies)
For what it’s worth, Python has graphlib.TopologicalSorter in the standard library, which can do this, including grouping tasks that can be run in parallel:
https://docs.python.org/3/library/graphlib.html
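A minimal sketch of that batching pattern, using a hypothetical four-job pipeline; run_job is a stand-in for real job execution:

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical pipeline: each job maps to the set of jobs it depends on.
jobs = {
    "build": set(),
    "unit-tests": {"build"},
    "integration-tests": {"build"},
    "deploy": {"unit-tests", "integration-tests"},
}

def run_job(name: str) -> str:
    print(f"running {name}")
    return name

sorter = TopologicalSorter(jobs)
sorter.prepare()  # also raises CycleError on circular dependencies

with ThreadPoolExecutor() as pool:
    while sorter.is_active():
        ready = sorter.get_ready()  # every job whose dependencies are done
        for finished in pool.map(run_job, ready):  # run the batch in parallel
            sorter.done(finished)  # unblocks jobs that depended on it
```

get_ready() hands back whole groups of independent jobs at once, which is exactly the parallel-batch behavior a pipeline runner needs.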
Care to elaborate? If you already deploy in Docker, then wouldn't this be nice?
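For anyone unfamiliar with the approach under discussion: running each job in a throwaway container keeps the build environment reproducible. A minimal sketch, assuming Docker is installed; the image and command are illustrative:

```python
import os
import subprocess

def run_job_in_container(image: str, command: str) -> int:
    """Run one pipeline job inside a throwaway Docker container."""
    # Mount the checked-out workspace and remove the container on exit.
    result = subprocess.run([
        "docker", "run", "--rm",
        "-v", f"{os.getcwd()}:/workspace",  # share the workspace with the job
        "-w", "/workspace",                 # run the command from the workspace
        image,
        "sh", "-c", command,
    ])
    return result.returncode

# e.g. run_job_in_container("python:3.12-slim", "pip install . && pytest -q")
```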