Hacker News

All of these problems are about dependencies.

And dependencies are about the way that we went from a blank slate to a working system.

If you can't retrace that path, you can't debug. If you don't have tools to track the path, you will make mistakes. At best, you can exactly replicate the system that you need to fix -- and fixing is changing.




If I understand correctly, Dockerfile, and image layers, encode that path, making it retrace-able, yes?


Not in general, no. Docker image layers are just snapshots of the filesystem changes that result from build commands. But there is nothing that guarantees that the effects of those commands are reproducible.

For example, it's incredibly common for Dockerfiles to contain commands like:

    RUN apt-get update && apt-get install -y foobar
which means when you build the container, you get whatever version of foobar happens to currently be in the repository that your base image points to. So you can easily end up in a situation where rebuilding an image gives you different behavior, even though nothing in your Dockerfile or build context changed.


Even the Docker support team is confused by this. Most of why I stopped engaging is that they would reject a feature because it led to Dockerfiles they said were 'not reproducible', even though those were just as reproducible as many of the core features.

Which is to say, not in the slightest.

Every time you do something in a Dockerfile that involves pulling something from the network, you don't have reproducibility. Unless you pull something using a cryptographic hash (which I've never actually seen anyone do), the network can give you a different answer every time. The answer might not even be idempotent (calling it once changes the result of calling it again).
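For reference, a hash-pinned fetch looks something like the following. This is only a sketch: the filename is a placeholder, and the network download is simulated by writing a local file so the snippet is self-contained (in a real Dockerfile RUN step you'd curl the artifact and compare it against a hash you recorded when you reviewed it).

```shell
set -e
# Stand-in for the downloaded artifact (a real build would curl it).
printf 'artifact contents\n' > foobar.tar.gz
# The expected hash would normally be recorded ahead of time; here we
# compute it from the stand-in file just to demonstrate the check.
expected=$(sha256sum foobar.tar.gz | cut -d' ' -f1)
# Fail the build if the fetched bytes don't match the recorded hash.
echo "${expected}  foobar.tar.gz" | sha256sum -c -
```

Newer BuildKit Dockerfiles can also do this natively with ADD --checksum=sha256:..., which refuses the fetch if the bytes don't match.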

My argument was the same as yours. Virtually every Dockerfile has an 'update' call as the second or third layer, which means none of the repeatable layers after it (like useradd or mkdir) are actually repeatable. You're building a castle on sand.

Docker images are reproducible. Docker builds are not. And that's okay, as long as everyone accepts that to be true. And if Docker contributors can't accept that then we are all truly fucked, until something else comes along that retreads the ideas.


Sounds like Nix is what you're looking for.


Which can conveniently produce Docker images.


Is there some documentation on exactly how to do this? I've been curious about this in the past, but my search-fu is failing me...


There's a whole chapter on images (VM, docker, appimage) in the manual. I can recommend starting there. https://nixos.org/manual/nixpkgs/stable/#chap-images

The smallest example is:

    pkgs.dockerTools.buildLayeredImage {
      name = "hello";
      contents = [ pkgs.hello ];
    }


Is there really a significant number of folks who are confused by the idea that networks can serve different files under the same name at different times?


> which means when you build the container, you get whatever version of foobar happens to currently be in the repository that your base image points to.

Well no, the layer won't be rebuilt if you already have it from yesterday or last week and that line plus the previous lines in the Dockerfile didn't change. So you would still be using the old version, while someone new to the project would get the most up-to-date one. You have to remember to skip the cache (docker build --no-cache) to get the most recent version.


>Well no, the layer won't be rebuilt if you already have it from yesterday or last week and that + previous lines in the Dockerfile didn't change.

When people say it needs to be reproducible, they don't mean "reproducible in the same host machine, with the same cache lying around".


That's why I like conda's yml files: every dependency is listed with its version number and the conda channel it came from.


You can omit version numbers in conda files as well, though.


yes true! But by default, `conda env export > env.yml` will include the version numbers of the current environment, which is how I usually use it.


Yes, but a few things about this:

1) You can apt-get a specific version.

2) Even if you don't pin a version, two developers who build and then run the container within some time frame will get the same version. It would be rare to get a different one on the same day.

3) Even unpinned, the odds that two developers end up with the same version this way are far and away better than if they were each running two wildly different versions of Ubuntu locally on their desktops.

Also, most developers working on the same codebase will push to their branch, and CircleCI will make it so there's only one Docker image, which is tested and eventually moves along to prod.
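For what it's worth, point 1 looks like this in a Dockerfile. The package name and version here are made-up placeholders; apt-cache madison foobar would show you which versions the configured repositories actually offer.

```dockerfile
# Pin an exact package revision instead of taking whatever is current.
# "foobar" and "1.0.3-2" are hypothetical.
RUN apt-get update && apt-get install -y foobar=1.0.3-2
```

Note this only pins the version string; as the sibling comment points out, it doesn't guarantee the package contents are byte-identical to yesterday's.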


You can apt-get a specific version, but that is not a 100% guarantee that the contents of the package will be identical to what was there yesterday. It is absolutely possible to delete a package from a repo and replace it with another one with the same version.


Distros don't do this and have communally enforced conventions for package revisions that don't involve updates/changes to the upstream source.

Proprietary software vendors and in-house corporate repos might pull sloppy crap here, though.

It is nice when your package manager reliably alerts you to changes like that, though! Binary package managers can't, really (although this feature could be added), but source-based ones often do. Nix and Guix enforce this categorically, including for the upstream source tarballs of source builds. Even Homebrew includes hashes for its binaries, and provides options to enforce them (refusing to install packages whose recipes don't include hashes).


> even if you don't apt-get a specific version, if two developers build and then run the container (within some time frame)

All of these "maybe it'll work" type ideas are what make modern software so brittle.


If I tell you that there's a remote code execution in libfoobar-1.03 through 1.15, how long does it take you to verify where libfoobar is installed, and what versions are in use? Remember, nobody ships an image layer of libfoobar, it's a common component, though not as common as openssl.

Is there one command, or one script, which can do that? You need that, basically daily.

Is there one command to rebuild with the new libfoobar-1.17, and at most one more to test for basic functionality? You need that, too.
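There's no single built-in command, but the audit half is scriptable once you have an inventory. A sketch in shell: the inventory file is hand-written stand-in data (in reality you'd populate it per image with something like docker run --rm IMAGE dpkg-query -W -f '${Version}\n' libfoobar), and the numeric awk comparison only works here because every version has a two-digit minor; real code should use dpkg --compare-versions.

```shell
# Hypothetical inventory: image name -> installed libfoobar version.
cat > inventory.txt <<'EOF'
app-frontend 1.12
worker 1.16
api 1.03
EOF
# Flag anything in the vulnerable 1.03-1.15 range.
awk '$2 >= 1.03 && $2 <= 1.15 { print $1, "VULNERABLE", $2; next }
     { print $1, "ok", $2 }' inventory.txt
```

With this data, app-frontend and api get flagged and worker passes. The rebuild half is the hard part: you still need every image's Dockerfile to be rebuildable on demand.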


I mean, you're not gonna like the answer, but in real life, when herding cats, the answer is setting up an image scanner and Renovate and calling it a day.

It's not like OS images are any better in this respect. I've cut my teeth long enough on software that depends on the base OS and blocks infra upgrades. Bifurcating responsibility into app/platform is a breath of fresh air by comparison.


Yup this is a hard problem. Shameless plug to my blogpost on how we built something to do this: https://eng.lyft.com/vulnerability-management-at-lyft-enforc...


....I can tell you which ones are used by the current linker on the system in what order.

ldconfig -p | grep libfoobar

If you go and spread the linkers all over hither and yon and make it nigh impossible to get results out of them, or don't bother to keep track of where you put things... Welp. Can't help ya there. Shoulda been tracking that all along.

Oh, excuse me... That'll only work for dynamically linked things, in the absence of statically linked stuff or clever LD_PRELOAD shenanigans.

Hope ya don't do that.

Fact is. You're really never going to get away from keeping track of all the moving pieces. You're just shuffling complexity and layers of abstraction around.


Same problem with flatpak. How many versions of openssl are on my laptop? I have no idea!



