How Container Filesystem Works: Building a Docker-Like Container From Scratch

Posted4 months agoActive4 months ago

lgunsch

183 points

32 comments

labs.iximiuz.comTechstory

calmpositive

Debate

40/100

ContainerizationDockerFilesystem

Key topics

Containerization

Docker

Filesystem

The article explains how container filesystems work by building a Docker-like container from scratch, sparking a discussion on the history and inner workings of containerization technology.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment

Peak period

72-84h

Avg / period

Comment distribution32 data points

Loading chart...

Based on 32 loaded comments

Key moments

01Story posted
Sep 13, 2025 at 10:37 AM EDT
4 months ago
Step 01
02First comment
Sep 16, 2025 at 2:56 PM EDT
3d after posting
Step 02
03Peak activity
23 comments in 72-84h
Hottest window of the conversation
Step 03
04Latest activity
Sep 18, 2025 at 4:14 PM EDT
4 months ago
Step 04

Generating AI Summary...

Analyzing up to 500 comments to identify key contributors and discussion patterns

Discussion (32 comments)

Showing 32 comments

zoobab

4 months ago

5 replies

We had chroot since 1979, nobody managed to build a docker like wrapper for chroot which do not require netns?

ronsor

4 months ago

1 reply

Chroot has significantly less isolation than Linux namespaces as used by Docker.

miladyincontrol

4 months ago

This, better yet just use systemd-nspawn. Benefits of proper containers, configuration similar to any ol systemd service, super easy to use, simple to automate builds with mkosi.

The one thing people really seem to miss on them is like, contrary to popular belief you dont need a whole OS container there, minimal distroless containers work just fine with systemd-nspawn similar to as they would on docker.

interroboink

4 months ago

5 replies

FreeBSD has had jails since version 4 (~year 2000), fwiw.

Much of the technology was there, but Docker was able to achieve a critical mass, with streamlined workflows. Perhaps as much a social phenomenon as a technical one?

Yeroc

4 months ago

2 replies

I think the real genius of Docker was the image packaging. The pieces were there but delivery and scripting it all wasn't easy.

disagr

4 months ago

BSD jails were no harder to automate than Docker; setup many ci/cd pipelines into jails in the 00s for a variety of applications.

They're way closer to the usual "Unix" tool feel too. Docker feels 100% like an attempt to get rich by engineering a monolith rather than be a few helper C tools. Docker was so annoying to learn.

Fortunately with the end of ZIRP and SaaS deflation (in real user terms, not fake investment to project we still live in the 2010s heyday), software engineers are focused on engineering more than hype generation. Am excited about energy based models, capture of electromagnetic geometry of the machine as it runs programs.

60s style lexical state management systems dragged forward in time because of social momentum have little to do with engineering. Are hardly high tech in 2025.

mikepurvis

4 months ago

Indeed. Even to this day, debootstrap feels a bit arcane and unapproachable, particularly relative to `docker pull ubuntu`.

tkcranny

4 months ago

Yeah it really was a social phenomena. Ten years ago conferences were swarmed with docker employees, swag, plenty of talks and excitement.

The effort to introduce the concepts to the mainstream can’t be understated. It seems mundane now but it took a lot of grassroots effort and marketing to hit that critical mass.

magicalhippo

4 months ago

I used FreeBSD on my firewall in the early 2000s, and on my NAS from around 2007 till last year.

The big pain with jails for me was the tooling. There was a number of non-trivial steps needed to get a jail that could host a networked service, with a lot that could go wrong along the way.

Sure a proper sysadmin would learn and internalize these steps, but as someone who just used it now and again it was a pain.

Way down the line things like iocage came along, but it was fragile and not reliable when I tried it, leading to jails in weird states and such.

So I gave up and moved to Linux so I could use Docker.

Super easy to spin up a new service, and fairly self-documenting as you just configure everything in a script or compose file so much less to remember.

Initially in a VM on Bhyve, now on bare metal.

It feels a bit sad though, as jails had some nice capabilities due to the extra isolation.

jayd16

4 months ago

There was clear incremental progress. Some of it can be seen in how mobile app isolation shook out as well.

oftenwrong

4 months ago

Don't discount the technical innovation required to integrate existing technologies in a novel and useful way. Docker was an "off the shelf" experience unlike any other solution at the time. You could `docker run ...` and have the entire container environment delivered incrementally on demand with almost no setup required. It did have a social factor in that it was easy for people to publish their own images and share them. Docker Hub was provided as a completely free distribution service. The way they made distribution effortless was no doubt a major factor in why it took off.

https://www.youtube.com/watch?v=wW9CAH9nSLs

vbezhenar

4 months ago

4 replies

Docker is a genius idea which looks obvious in retrospect, but someone need to invent it.

Docker is more than just chroot. You also need: overlay file system; OCI registry and community behind it, to create thousands of useful images. And, of course, the whole idea of creating images layer by layer and using immutable images to spawn mutable containers.

I don't actually think that you need network or process isolation. In terms of isolation, chroot is enough for most practical needs. Network and process isolations are nice to have, but they are not essential.

akdev1l

4 months ago

2 replies

network isolation is very important too, that’s what lets people run 4 containers all listening on port 80

process isolation is less prominent

vbezhenar

4 months ago

1 reply

You can bind your application to 127.0.0.2 for one container and to 127.0.0.3 for another container. Both can listen on port 80 and both can communicate with each other. And you can run another container, binding to 1.2.3.4:80 and using it as reverse-router. You can use iptables/nftables to prevent undesired connections and manually (or with some scripting) crafted /etc/hosts for named hosts to point to those loopback addresses. Or just DNS server. It's all doable.

The only thing that you need is the ability to configure a target application to choose address to bind to. But any sane application have that configuration knob.

Of course things are much easier with network namespaces, but you can go pretty far with host network (and I'd say it might be easier to understand and manage).

cbluth

4 months ago

You can see why people like the docker experience, you can manage to do all that in a single interface, instead of one off scripts touching a ton of little things

mikepurvis

4 months ago

Process isolation is more about load management/balancing, which is more of a production concern than a development one.

lyu07282

4 months ago

3 replies

What I always wondered is why qcow2 + qemu never gave rise to a similar system, they support snapshots/backing-files so it should be possible to implement a system similar to docker? Instead what we got is just this terrible libvirt.

dboreham

4 months ago

We called it "VMware".

westurner

4 months ago

Containerd/nerdctl supports a number of snapshotter plugins: Nydus, e Stargz, SOCI: Seekable OCI, fuse-overlayfs;

containerd/stargz-snapshotter: https://github.com/containerd/stargz-snapshotter

containerd/nerdctl//docs/nydus.md: https://github.com/containerd/nerdctl/blob/main/docs/nydus.m... :

nydusify and Check Nydus image: https://github.com/dragonflyoss/nydus/blob/master/docs/nydus... :

> Nydusify provides a checker to validate Nydus image, the checklist includes image manifest, Nydus bootstrap, file metadata, and data consistency in rootfs with the original OCI image. Meanwhile, the checker dumps OCI & Nydus image information to output (default) directory.

nydus: https://github.com/dragonflyoss/nydus

awslabs/soci-snapshotter: https://github.com/awslabs/soci-snapshotter ; lazy start standard OCI images

/? lxc copy on write: https://www.google.com/search?q=lxc+copy+on+write : lxc-copy supports btrfs, zfs, lvm, overlayfs

lxc/incus: "Add OCI image support" https://github.com/lxc/incus/issues/908

opencontainers/image-spec; OCI Image spec: https://github.com/opencontainers/image-spec

opencontainers/distribution-spec; OCI Image distribution spec: https://github.com/opencontainers/distribution-spec

But then in the

opencontainers/runtime-spec//config.md OCI runtime spec TODO bundle config.json there is an example of a config.json https://github.com/opencontainers/runtime-spec/blob/main/con...

The LXC approach is to run systemd in the container.

The quadlet approach is to not run systemd /sbin/init in the container; instead create .container files in /etc/containers/systemd/ (rootful) or ~/.config/containers/systemd/*.container (for rootless) so that the host systemd manages and logs the container processes.

Then realized you said QEMU not LXC.

LXD: https://canonical.com/lxd :

> LXD provides both [QEMU,] KVM-based VMs and system containers based on LXC – that can run a full Linux OS – in a single open source virtualisation platform. LXD has numerous built-in management features, including live migration, snapshots, resource restrictions, projects and profiles, and governs the interaction with various storage and networking options.

From https://documentation.ubuntu.com/lxd/latest/reference/storag... :

> LXD supports the following storage drivers for storing images, instances and custom volumes:

> Btrfs, CephFS, Ceph Object, Ceph RBD, Dell PowerFlex, Pure Storage, HPE Alletra, Directory, LVM, ZFS

You can run Podman or Docker within an LXD host; with or without a backing storage pool. FWIU it's possible for containers in an LXD VM to use BTRFS, ZFS, or KVM storage drivers to create e.g. BTRFS subvolumes instead of running overlayfs within the VM by editing storage.conf.

everfrustrated

4 months ago

The short answer is docker concentrated on files, whereas other VM oriented tech concentrated on block devices.

Dockers is conceptually simpler for devs and the layer use case but has huge performance issues which is why it never went anywhere for non-docker classic IT type use cases.

tguvot

4 months ago

i tried to build at work something like docker around 2003-2004. was trying to solve problem of distribution/updates/rollblacks of software on network appliances that we made. overlay filesystems back then were immature/buggy so it went nowhere. loopback mounted system was not sufficient (don't remember why)

harrall

4 months ago

I was a very early adopter of Docker and what sold me was Dockerfiles.

A SINGLE regular text file that took regular shell commands and could build the same deployment from scratch every time and then be cleaned up in one command.

This was UNHEARD of. Every other solution required learning new languages, defining “modules,” creating sets of scripts, or doing a lot of extra things. None of that was steezy.

I was so sold on Dockerfiles that I figured that even if the Docker project died, my Dockerfiles would continue to live because other people would try copy the idea of Dockerfiles. Now it’s been 10 years and Docker and containerization has changed a lot but what hasn’t? Dockerfiles. My 10 year Dockerfiles are still valid. That’s how good they were.

spullara

4 months ago

1 reply

Solaris Zones (follow on to Solaris Containers) was pretty amazing.

https://en.wikipedia.org/wiki/Solaris_Containers

dboreham

4 months ago

1 reply

Quick note that all these things are pre-dated (by decades) by mainframe virtualization schemes such as MVS.

spullara

4 months ago

100%!