Flying Blind: When software supply chains fail, all software is suspect.
Updated: Mar 9, 2020
“Click here to download”.
Easier execution demands more trust in what you're downloading. Complete automation means blind faith. This ease raises a number of questions, starting with "Is this wise?" and "Is there a system of security?" and ending with "How do you know this code has not been tampered with?" The answers are "no, no, and you don't"; ease of use means we fly blind.
Software passes through a remarkable number of hands and machines on its way from coders to users. Here is one example of an open source "supply chain"; the supply chain for proprietary, commercial software is similar, though the details differ.
The code author uploads source code to a repository (Git).
The software developer downloads a copy of the source from the repository to their own computer.
They make a modification and (hopefully) build and test it.
These patches are then submitted and reviewed for inclusion in a future release.
The patches are then committed to the source software repository for eventual release.
These modifications to a software package are then released. The release is either a tarball of all the source or, in modern source control systems, a cryptographic hash that identifies the release in the repository. An announcement of the new version is made to the world.
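For the tarball form of a release, the check a downstream recipient can make is to recompute the file's digest and compare it with the one published alongside the announcement. The following is a minimal sketch, assuming the project publishes a SHA-256 digest in hexadecimal; the function names are hypothetical and not taken from any particular project's tooling.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large tarballs need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """True if the file's digest equals the digest announced with the release.

    Assumption: `published_hex` is a hex SHA-256 string copied from the
    release announcement; surrounding whitespace and letter case are ignored.
    """
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A match only shows that the file is the one the digest describes. If the digest travelled over the same channel as the tarball, whoever tampered with one could have tampered with the other.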
The next steps generally occur within the software distribution organization, or "distro": for example Red Hat, SUSE, Canonical, Debian, or FreeBSD.
A package maintainer evaluates a version for release, by building, installing and testing it on their computer.
Any patches needed to make software work properly in the distro environment are applied to the pristine source.
New software source must be packaged by combining it with information on how to build it: the options and configuration choices, which other packages are required for the build, where to install the software in the file system, which versions of which libraries to link against, and so on. The source package then generates one or more binary packages for installation.
With the software complete, packages are uploaded for inclusion in the distribution.
Software distributions run their own build systems, distinct from the package maintainers' machines, which build from the source packages in order to support multiple machine architectures. After quality assurance, these packages are moved to a "software repository" for release.
The software repository is mirrored (Debian Linux is mirrored at more than 230 sites worldwide) or supplied via a Content Distribution Network.
A systems integrator may combine packages with other packages into a final integrated product, which the integrator then redistributes.
The final software is then downloaded and installed on the customer's system (often by intermediaries, such as IT support or systems integrators).
Look at your smartphone, or car, or TV. The firm that brought it to market, whether Apple, Toyota, or Samsung, has tight control over the physical parts in it; managing the supply chains that bring in radio modems, gearboxes, or screens is a top priority. Pre-installed software on a device adds further steps to the already long software supply chain. Then look at your phone or PC six months after you bought it. You have literally no idea where the code on it came from.
Actually: nobody does. Not Apple, not Google, not Samsung, and definitely not you.
And yet, any machine, person, or organization in the supply chain of any component can be compromised. Digital signatures protect packages from the original repository through to installation, but repositories and mirrors can still be attacked directly, through man-in-the-middle attacks, or by theft of the signing keys.
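The difference between a tampering mirror and a stolen key can be made concrete with a toy model. This sketch uses an HMAC from Python's standard library purely as a stand-in for a digital signature; that is an assumption made for brevity, since real distributions use asymmetric signatures, where the verifier holds only a public key. The key and package contents are hypothetical.

```python
import hashlib
import hmac


def sign(package_bytes, key):
    """Stand-in for a digital signature: an HMAC-SHA256 tag over the package."""
    return hmac.new(key, package_bytes, hashlib.sha256).hexdigest()


def verify(package_bytes, tag, key):
    """True if `tag` is the tag that `key` produces for `package_bytes`."""
    return hmac.compare_digest(sign(package_bytes, key), tag)


# Hypothetical signing key and package contents, for illustration only.
distro_key = b"hypothetical-distro-signing-key"
genuine = b"package contents v1.0"
genuine_tag = sign(genuine, distro_key)

# Case 1: a mirror alters the package but cannot produce a new tag.
tampered = b"package contents v1.0 plus backdoor"
mirror_tamper_detected = not verify(tampered, genuine_tag, distro_key)

# Case 2: an attacker who has stolen the key signs the altered package.
forged_tag = sign(tampered, distro_key)
stolen_key_forgery_accepted = verify(tampered, forged_tag, distro_key)
```

In this model the first case is caught and the second is not: once the key is in the wrong hands, the verification step cannot tell the forged package from a genuine one.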
If any single link in this chain fails, your system is compromised.
We must provide resilience at all stages of software distribution. We must make the software supply chain rigorous, hygienic, reliable. But how?